Summary:
Unrolling Analyzer is already pretty complicated, and it becomes harder and harder to exercise it with usual IR tests, as with them we can only check the final decision: whether the loop is unrolled or not. This change factors this framework out from LoopUnrollPass to analyses, which allows to use unit tests.
The change itself is supposed to be NFC, except adding a couple of tests.
I plan to add more tests as I add new functionality and find/fix bugs.
Reviewers: chandlerc, hfinkel, sanjoy
Subscribers: zzheng, sanjoy, llvm-commits
Differential Revision: http://reviews.llvm.org/D16623
llvm-svn: 260169
The Windows bots have been failing for the last two days, with:
FAILED: C:\PROGRA~2\MICROS~1.0\VC\bin\amd64\cl.exe -c LLVMContextImpl.cpp
D:\buildslave\clang-x64-ninja-win7\llvm\lib\IR\LLVMContextImpl.cpp(137) :
error C2248: 'llvm::TrailingObjects<llvm::AttributeSetImpl,
llvm::IndexAttrPair>::operator delete' :
cannot access private member declared in class 'llvm::AttributeSetImpl'
TrailingObjects.h(298) : see declaration of
'llvm::TrailingObjects<llvm::AttributeSetImpl,
llvm::IndexAttrPair>::operator delete'
AttributeImpl.h(213) : see declaration of 'llvm::AttributeSetImpl'
llvm-svn: 260053
-fsized-deallocation. Disable sized deallocation for all objects derived from
TrailingObjects, as we expect the storage allocated for these objects to be
larger than the size of their dynamic type.
llvm-svn: 259942
Unfortunately, ProgramInfo::ProcessId is signed on Unix and unsigned on
Windows, breaking the standard fix of using '0U' in the gtest
expectation.
llvm-svn: 259704
With this patch, the profile summary data will be available in indexed
profile data file so that profiler reader/compiler optimizer can start
to make use of.
Differential Revision: http://reviews.llvm.org/D16258
llvm-svn: 259626
differentiate between indirect references to functions an direct calls.
This doesn't do a whole lot yet other than change the print out produced
by the analysis, but it lays the groundwork for a very major change I'm
working on next: teaching the call graph to actually be a call graph,
modeling *both* the indirect reference graph and the call graph
simultaneously. More details on that in the next patch though.
The rest of this is essentially a bunch of over-engineering that won't
be interesting until the next patch. But this also isolates essentially
all of the churn necessary to introduce the edge abstraction from the
very important behavior change necessary in order to separately model
the two graphs. So it should make review of the subsequent patch a bit
easier at the cost of making this patch seem poorly motivated. ;]
Differential Revision: http://reviews.llvm.org/D16038
llvm-svn: 259463
With poorly chosen custom parameters, the line table encoding logic would
sometimes end up generating a special opcode bigger than 255, which is wrong.
The set of default parameters that LLVM uses isn't subject to this bug.
When carefully chosing the line table parameters, it's impossible to fall into the
corner case that this patch fixes. The standard however doesn't require that these
parameters be carefully chosen. And even if it did, we shouldn't generate broken
encoding.
Add a unittest for this specific encoding bug, and while at it, create some unit
tests for the encoding logic using different sets of parameters.
llvm-svn: 259334
Add an option to llvm-profdata merge for writing out sparse indexed
profiles. These profiles omit InstrProfRecords for functions which are
never executed.
Differential Revision: http://reviews.llvm.org/D16727
llvm-svn: 259258
This reverts commit r259117.
The LineInfo constructor is defined in the codeview library and we have
to link against it now. Doing that isn't trivial, so reverting for now.
llvm-svn: 259126
Adds a new family of .cv_* directives to LLVM's variant of GAS syntax:
- .cv_file: Similar to DWARF .file directives
- .cv_loc: Similar to the DWARF .loc directive, but starts with a
function id. CodeView line tables are emitted by function instead of
by compilation unit, so we needed an extra field to communicate this.
Rather than overloading the .loc direction further, we decided it was
better to have our own directive.
- .cv_stringtable: Emits the codeview string table at the current
position. Currently this just contains the filenames as
null-terminated strings.
- .cv_filechecksums: Emits the file checksum table for all files used
with .cv_file so far. There is currently no support for emitting
actual checksums, just filenames.
This moves the line table emission code down into the assembler. This
is in preparation for implementing the inlined call site line table
format. The inline line table format encoding algorithm requires knowing
the absolute code offsets, so it must run after the assembler has laid
out the code.
David Majnemer collaborated on this patch.
llvm-svn: 259117
at least as big as the mach header to be identified as a Mach-O file and
make sure smaller files are not identified as a Mach-O files but as
unknown files. Also fix identify_magic() so it looks at all 4 bytes of
the filetype field when determining the type of the Mach-O file.
Then fix the macho-invalid-header test case to check that it is an
unknown file and make sure it does not get the error for
object_error::parse_failed. And also update the unit tests.
llvm-svn: 258883
Summary:
This patch is provided in preparation for removing autoconf on 1/26. The proposal to remove autoconf on 1/26 was discussed on the llvm-dev thread here: http://lists.llvm.org/pipermail/llvm-dev/2016-January/093875.html
"I felt a great disturbance in the [build system], as if millions of [makefiles] suddenly cried out in terror and were suddenly silenced. I fear something [amazing] has happened."
- Obi Wan Kenobi
Reviewers: chandlerc, grosbach, bob.wilson, tstellarAMD, echristo, whitequark
Subscribers: chfast, simoncook, emaste, jholewinski, tberghammer, jfb, danalbert, srhines, arsenm, dschuff, jyknight, dsanders, joker.eph, llvm-commits
Differential Revision: http://reviews.llvm.org/D16471
llvm-svn: 258861
Summary:
Update ObjectTransformLayer::addObjectSet to take the object set by
value rather than reference and pass it to the base layer with move
semantics rather than copy, to match r258185's changes to
ObjectLinkingLayer.
Update the unit test to verify that ObjectTransformLayer's signature stays
in sync with ObjectLinkingLayer's.
Reviewers: lhames
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D16414
llvm-svn: 258630
Using an array instead of ArrayRef would allow type inference, but
(short of using C99) one would still need to write
typedef uint16_t VT[];
LE.write(VT{0x1234, 0x5678});
llvm-svn: 258535
they're needed.
Prior to this patch objects were loaded (via RuntimeDyld::loadObject) when they
were added to the ObjectLinkingLayer, but were not relocated and finalized until
a symbol address was requested. In the interim, another object could be loaded
and finalized with the same memory manager, causing relocation/finalization of
the first object to fail (as the first finalization call may have marked the
allocated memory for the first object read-only).
By deferring the loadObject call (and subsequent memory allocations) until an
object file is needed we can avoid prematurely finalizing memory.
llvm-svn: 258185
Previously these were Darwin-only. Since the switch to direct binary emission
of stubs, trampolines and resolver blocks, these should work on other *nix
platforms too.
These tests can be enabled on Windows once known issues with ORC's handling of
Windows symbol mangling (see e.g. https://llvm.org/PR25940) have been fixed.
llvm-svn: 258031
Summary:
Simplify the memory management of mock IR in test AlterInvokeBundles.
Reviewers: dblaikie
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D16211
llvm-svn: 257892
Summary:
Before this the Verifier didn't complain if the GlobalVariable
referenced from a DIGlobalVariable was not in fact in the correct
module (it would crash while writing bitcode though). Fix this by
always checking parantage of GlobalValues while walking constant
expressions and changing the DIGlobalVariable visitor to also
visit the constant it contains.
Reviewers: rafael
Differential Revision: http://reviews.llvm.org/D16059
llvm-svn: 257825
Summary:
We already have the inverse verification that we only use globals
that are defined in this module. This essentially catches the
same mistake, but when verifying the module that contains the
definition.
Reviewers: rafael
Differential Revision: http://reviews.llvm.org/D15272
llvm-svn: 257823
Summary:
The overloads of CallInst::Create and InvokeInst::Create that are used to
adjust operand bundles purport to create a new instruction "identical in
every way except [for] the operand bundles", so copy the DebugLoc along
with everything else.
Reviewers: sanjoy, majnemer
Subscribers: majnemer, dblaikie, llvm-commits
Differential Revision: http://reviews.llvm.org/D16157
llvm-svn: 257745
Summary:
The problem here is that an enum class can not be implicitly converted to an
integer. That assumption snuck back into PointerIntPair. This commit fixes the
issue and more importantly adds some unittests to make sure that we do not break
this again.
rdar://23594806
Reviewers: gribozavr
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D16131
llvm-svn: 257574
Summary: Add SaturatingMultiplyAdd convenience function template since A + (X * Y) comes up frequently when doing weighted arithmetic.
Reviewers: davidxl, silvas
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D15385
llvm-svn: 257532
This patch adds utilities to ORC for managing a remote JIT target. It consists
of:
1. A very primitive RPC system for making calls over a byte-stream. See
RPCChannel.h, RPCUtils.h.
2. An RPC API defined in the above system for managing memory, looking up
symbols, creating stubs, etc. on a remote target. See OrcRemoteTargetRPCAPI.h.
3. An interface for creating high-level JIT components (memory managers,
callback managers, stub managers, etc.) that operate over the RPC API. See
OrcRemoteTargetClient.h.
4. A helper class for building servers that can handle the RPC calls. See
OrcRemoteTargetServer.h.
The system is designed to work neatly with the existing ORC components and
functionality. In particular, the ORC callback API (and consequently the
CompileOnDemandLayer) is supported, enabling lazy compilation of remote code.
Assuming this doesn't trigger any builder failures, a follow-up patch will be
committed which tests these utilities by using them to replace LLI's existing
remote-JITing demo code.
llvm-svn: 257305
RuntimeDyld::MemoryManager.
The RuntimeDyld::MemoryManager::reserveAllocationSpace method is called when
object files are loaded, and gives clients a chance to pre-allocate memory for
all segments. Previously only the size of each segment (code, ro-data, rw-data)
was supplied but not the alignment. This hasn't caused any problems so far, as
most clients allocate via the MemoryBlock interface which returns page-aligned
blocks. Adding alignment arguments enables finer grained allocation while still
satisfying alignment restrictions.
llvm-svn: 257294
llvm\unittests\ExecutionEngine\Orc\ObjectLinkingLayerTest.cpp(115) : error C2327: 'llvm::OrcExecutionTest::TM' : is not a type name, static, or enumerator
llvm\unittests\ExecutionEngine\Orc\ObjectLinkingLayerTest.cpp(115) : error C2065: 'TM' : undeclared identifier
FYI, "this->TM" was valid even before moving class SectionMemoryManagerWrapper.
llvm-svn: 257290
type.
This makes it easy and safe to use a set of flags as one elmenet of
a tagged union with pointers. There is quite a bit of code that has
historically done this by casting arbitrary integers to "pointers" and
assuming that this was safe and reliable. It is neither, and has started
to rear its head by triggering safety asserts in various abstractions
like PointerLikeTypeTraits when the integers chosen are invariably poor
choices for *some* platform and *some* situation. Not to mention the
(hopefully unlikely) prospect of one of these integers actually getting
allocated!
With this, it will be straightforward to build type safe abstractions
like this without being error prone. The abstraction itself is also
remarkably simple thanks to the implicit conversion.
This use case and pattern was also independently created by the folks
working on Swift, and they're going to incrementally add any missing
functionality they find.
Differential Revision: http://reviews.llvm.org/D15844
llvm-svn: 257284
This is a much more general and powerful form of PointerUnion. It
provides a reasonably complete sum type (from type theory) for
pointer-like types. It has several significant advantages over the
existing PointerUnion infrastructure:
1) It allows more than two pointer types to participate without awkward
nesting structures.
2) It directly exposes the tag so that it is convenient to write
switches over the possible members.
3) It can re-use the same type for multiple tag values, something that
has been worked around by either abusing PointerIntPair or defining
nonce types and doing unsafe pointer casting.
4) It supports customization of the PointerLikeTypeTraits used for
specific member types. This means it could (in theory) be used even
with types that are over-aligned on allocation to expose larger
numbers of bits to the tag.
All in all, I think it is at least complimentary to the existing
infrastructure, and a strict improvement for some use cases.
Differential Revision: http://reviews.llvm.org/D15843
llvm-svn: 257282
managers.
Prior to this patch, recursive finalization (where finalization of one
RuntimeDyld instance triggers finalization of another instance on which the
first depends) could trigger memory access failures: When the inner (dependent)
RuntimeDyld instance and its memory manager are finalized, memory allocated
(but not yet relocated) by the outer instance is locked, and relocation in the
outer instance fails with a memory access error.
This patch adds a latch to the RuntimeDyld::MemoryManager base class that is
checked by a new method: RuntimeDyld::finalizeWithMemoryManagerLocking, ensuring
that shared memory managers are only finalized by the outermost RuntimeDyld
instance.
This allows ORC clients to supply the same memory manager to multiple calls to
addModuleSet. In particular it enables the use of user-supplied memory managers
with the CompileOnDemandLayer which must reuse the supplied memory manager for
each function that is lazily compiled.
llvm-svn: 257263
Done in InstrProfWriter to eliminate the need for client
code to do the sorting. The operation is done once and reused
many times so it is more efficient. Update unit test to remove
sorting. Also update expected output of affected tests.
llvm-svn: 257145
For a new record with weight != 1, only edge profiling
counters are scaled, VP data is not properly scaled.
This patch refactors the code and fixes the problem.
Also added sort by count interface (for follow up patch).
llvm-svn: 257143
Fix PR24852 (crash with -debug -instcombine)
Patch by Than McIntosh <thanm@google.com>
Summary:
Add guards to the asm writer to prevent crashing
when dumping an instruction that has no basic
block.
Differential Revision: http://reviews.llvm.org/D15798
From: Than McIntosh <thanm@google.com>
llvm-svn: 257094
This adds a unittest for the support added in r256648 to add
a flag that can be used to prevent RAUW on temporary metadata
used as a map key.
llvm-svn: 256938
...and mark it as merely an input_iterator rather than a forward_iterator,
since it is destructive. And then rewrite == to take advantage of that.
Patch by Alex Denisov!
llvm-svn: 256913
Summary:
There are a number of files in the tree which have been accidentally checked in with DOS line endings. Convert these to native line endings.
There are also a few files which have DOS line endings on purpose, and I have set the svn:eol-style property to 'CRLF' on those.
Reviewers: joerg, aaron.ballman
Subscribers: aaron.ballman, sanjoy, dsanders, llvm-commits
Differential Revision: http://reviews.llvm.org/D15848
llvm-svn: 256707
This is part of the effort/prepration to reduce the size
instr-pgo (object, binary, memory footprint, and raw data).
The functionality is currently off by default and not yet
used by any clients.
llvm-svn: 256667
This is necessary to use them as part of pointer traits and is generally
useful. I've added unit test coverage to isolate and ensure this works
correctly.
I'll watch the build bots to try to see if any compilers can't tolerate
this bit of magic (and much credit goes to Richard Smith for coming up
with this magical production!) but give a shout if you see issues.
llvm-svn: 256553
Previously, the code enforced non-decreasing alignment of each trailing
type. However, it's easy enough to allow for realignment as needed, and
thus avoid the developer having to think about the possiblilities for
alignment requirements on all architectures.
(E.g. on Linux/x86, a struct with an int64 member is 4-byte aligned,
while on other 32-bit archs -- and even with other OSes on x86 -- it has
8-byte alignment. This sort of thing is irritating to have to manually
deal with.)
llvm-svn: 256533
We didn't actually statically check this, and so it worked 25% of the
time for me. =/ Really sorry it took so long to fix, I shouldn't leave
the commit log editor window open without saving and landing the commit.
=[
llvm-svn: 256528
MSC18 Debug didn't merge them.
FIXME: I tweaked just to appease a builder. Almost string literals should be addressed identically there.
llvm-svn: 256459
Summary:
We need to actually remove the use of the personality function,
otherwise we can run into trouble if we want to e.g. delete
the personality function because ther's no way to get rid of
its uses. Do this by resetting to ConstantPointerNull value
that the operands are set to when first allocated.
Reviewers: vsk, dexonsmith
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D15752
llvm-svn: 256345
Creator and lookup interfaces are added to this symtab class.
The new interfaces will be used by InstrProf Readers and writer.
A unit test is also added for the new APIs.
llvm-svn: 256092
Remove all checks that required main thread to run faster than tasks in
ThreadPool, and yields which are now unnecessary. This should fix some
bot failures.
llvm-svn: 256056
- Automatic alignment of the base type for the alignment requirements
of the trailing types.
- Support for an arbitrary numbers of trailing types, instead of only
1 or 2, by using a variadic template implementation.
Upcoming commits to clang will take advantage of both of these features.
Differential Revision: http://reviews.llvm.org/D12439
llvm-svn: 256054
Type specific declarations have been moved to Type.h and error handling
routines have been moved to ErrorHandling.h. Both are included in Core.h
so nothing should change for projects directly including the headers,
but transitive dependencies may be affected.
llvm-svn: 255965
The current BranchProbability::normalizeProbabilities() forbids known and
unknown probabilities to coexist in the list. This was once used to help
capture probability exceptions but has caused some reported build
failures (https://llvm.org/bugs/show_bug.cgi?id=25838).
This patch removes this restriction by evenly distributing the complement
of the sum of all known probabilities to unknown ones. We could still
treat this as an abnormal behavior, but it is better to emit warnings in
our future profile validator.
Differential revision: http://reviews.llvm.org/D15548
llvm-svn: 255934
Summary: Surface counter overflow when merging profile data. Merging still occurs on overflow but counts saturate to the maximum representable value. Overflow is reported to the user.
Reviewers: davidxl, dnovillo, silvas
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D15547
llvm-svn: 255825
Summary:
This change adds support for specifying a weight when merging profile data with the llvm-profdata tool.
Weights are specified by using the --weighted-input=<weight>,<filename> option. Input files not specified
with this option (normal positional list after options) are given a default weight of 1.
Adding support for arbitrary weighting of input profile data allows for relative importance to be placed on the
input data from multiple training runs.
Both sampled and instrumented profiles are supported.
Reviewers: davidxl, dnovillo, bogner, silvas
Subscribers: silvas, davidxl, llvm-commits
Differential Revision: http://reviews.llvm.org/D15306
llvm-svn: 255659
This is a very simple implementation of a thread pool using C++11
thread. It accepts any std::function<void()> for asynchronous
execution. Individual task can be synchronize using the returned
future, or the client can block on the full queue completion.
In case LLVM is configured with Threading disabled, it falls back
to sequential execution using std::async with launch:deferred.
This is intended to support parallelism for ThinLTO processing in
linker plugin, but is generic enough for any other uses.
This is a recommit of r255444 ; trying to workaround a bug in the
MSVC 2013 standard library. I think I was hit by:
http://connect.microsoft.com/VisualStudio/feedbackdetail/view/791185/std-packaged-task-t-where-t-is-void-or-a-reference-class-are-not-movable
Recommit of r255589, trying to please g++ as well.
Differential Revision: http://reviews.llvm.org/D15464
From: mehdi_amini <mehdi_amini@91177308-0d34-0410-b5e6-96231b3b80d8>
llvm-svn: 255593
This is a very simple implementation of a thread pool using C++11
thread. It accepts any std::function<void()> for asynchronous
execution. Individual task can be synchronize using the returned
future, or the client can block on the full queue completion.
In case LLVM is configured with Threading disabled, it falls back
to sequential execution using std::async with launch:deferred.
This is intended to support parallelism for ThinLTO processing in
linker plugin, but is generic enough for any other uses.
This is a recommit of r255444 ; trying to workaround a bug in the
MSVC 2013 standard library. I think I was hit by:
http://connect.microsoft.com/VisualStudio/feedbackdetail/view/791185/std-packaged-task-t-where-t-is-void-or-a-reference-class-are-not-movable
Differential Revision: http://reviews.llvm.org/D15464
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 255589
This patch converts code that has access to a LLVMContext to not take a
diagnostic handler.
This has a few advantages
* It is easier to use a consistent diagnostic handler in a single program.
* Less clutter since we are not passing a handler around.
It does make it a bit awkward to implement some C APIs that return a
diagnostic string. I will propose new versions of these APIs and
deprecate the current ones.
llvm-svn: 255571
This patch adds optional fast-math-flags (the same that apply to fmul/fadd/fsub/fdiv/frem/fcmp)
to call instructions in IR. Follow-up patches would use these flags in LibCallSimplifier, add
support to clang, and extend FMF to the DAG for calls.
Motivating example:
%y = fmul fast float %x, %x
%z = tail call float @sqrtf(float %y)
We'd like to be able to optimize sqrt(x*x) into fabs(x). We do this today using a function-wide
attribute for unsafe-math, but we really want to trigger on the instructions themselves:
%z = tail call fast float @sqrtf(float %y)
because in an LTO build it's possible that calls with fast semantics have been inlined into a
function with non-fast semantics.
The code changes and tests are based on the recent commits that added "notail":
http://reviews.llvm.org/rL252368
and added FMF to fcmp:
http://reviews.llvm.org/rL241901
Differential Revision: http://reviews.llvm.org/D14707
llvm-svn: 255555
Make sure to check that the destination type is sized.
A check was present but was incorrectly checking the source type
instead.
Patch by Amaury SECHET!
Differential Revision: http://reviews.llvm.org/D15264
llvm-svn: 255536
This is a very simple implementation of a thread pool using C++11
thread. It accepts any std::function<void()> for asynchronous
execution. Individual task can be synchronize using the returned
future, or the client can block on the full queue completion.
In case LLVM is configured with Threading disabled, it falls back
to sequential execution using std::async with launch:deferred.
This is intended to support parallelism for ThinLTO processing in
linker plugin, but is generic enough for any other uses.
Differential Revision: http://reviews.llvm.org/D15464
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 255444
Summary:
Adds support for in-memory round-trip of sample profile data along with basic
round trip unit tests. This will also make it easier to include unit tests for
future changes to sample profiling.
Reviewers: davidxl, dnovillo, silvas
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D15211
llvm-svn: 255264
Introduced DIMacro and DIMacroFile debug info metadata in the LLVM IR to support macros.
Differential Revision: http://reviews.llvm.org/D14687
llvm-svn: 255245
The ConstantDataArray::getFP(LLVMContext &, ArrayRef<uint16_t>)
overload has had a typo in it since it was written, where it will
create a Vector instead of an Array. This obviously doesn't work at
all, but it turns out that until r254991 there weren't actually any
callers of this overload. Fix the typo and add some test coverage.
llvm-svn: 255157
Summary:
Improve SaturatingAdd()/SaturatingMultiply() to use bool * to optionally return overflow result.
This should make it clearer that the value is returned at callsites and reduces the size of the implementation.
Reviewers: davidxl, silvas
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D15219
llvm-svn: 255128
Currently, vectors of halfs end up as ConstantVectors, but there isn't
a good reason they can't be ConstantDataVectors. This should save some
memory.
llvm-svn: 254991
This is needed to support linking of module-level metadata as a
postpass after function importing, where we will be leaving temporary
metadata on imported instructions until the postpass metadata import.
Also added unittest. Split from D14838.
llvm-svn: 254914
Before this patch the diagnostic handler was optional. If it was not
passed, the one in the LLVMContext was used.
That is probably not a pattern we want to follow. If each area has an
optional callback, there is a sea of callbacks and it is hard to follow
which one is called.
Doing this also found cases where the callback is a nice addition, like
testing that no errors or warnings are reported.
The other option is to always use the diagnostic handler in the
LLVMContext. That has a few problems
* To implement the C API we would have to set the diag handler and then
set it back to the original value.
* Code that creates the context might be far away from code that wants
the diagnostics.
I do have a patch that implements the second option and will send that as
an RFC.
llvm-svn: 254777
This class is turning into a useful interface, rather than an implementation
detail, so I'm dropping the 'Base' suffix.
No functional change.
llvm-svn: 254693
This change adds support for an optional weight when merging profile data with the llvm-profdata tool.
Weights are specified by adding an option ':<weight>' suffix to the input file names.
Adding support for arbitrary weighting of input profile data allows for relative importance to be placed on the
input data from multiple training runs.
Both sampled and instrumented profiles are supported.
Reviewers: dnovillo, bogner, davidxl
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D14547
llvm-svn: 254669
Summary: This changes overflow handling during instrumentation profile merge. Rathar than throwing away records that would result in counter overflow, merged counts are instead clamped to the maximum representable value. A warning about counter overflow is still surfaced to the user as before.
Reviewers: dnovillo, davidxl, silvas
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D14893
llvm-svn: 254525
By including the module name in the error message.
This makes the error message much more useful and
saves a trip to the debugger.
Reviewers: dexonsmith
Subscribers: dexonsmith, llvm-commits
Differential Revision: http://reviews.llvm.org/D14473
llvm-svn: 254437
Raw profile writer needs to write all data of one kind in one continuous block,
so the buffer needs to be pre-allocated and passed to the writer method in
pieces for function profile data. The change adds the support for raw value data
writing.
llvm-svn: 254219
This is one of the many steps to commonize value profiling support between profile
runtime and compiler/llvm tools.
After this change, profiler runtime now can share the same C APIs to do VP
serialization/deseriazation with LLVM host tools (and produces value data
in identical format between indexed and raw profile).
It is not yet enabled in profiler runtime yet.
Also added a unit test case to test runtime profile data serialization/deserialization
interfaces implemented using common closure code.
llvm-svn: 254110
Summary: Adds the ability for callers to detect when saturation occurred on the result of saturating addition/multiplication.
Reviewers: davidxl, silvas, rsmith
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D14931
llvm-svn: 253921
Summary:
This change fixes the SaturatingMultiply<T>() function template to not cause undefined behavior with T=uint16_t.
Thanks to Richard Smith's contribution, it also no longer requires an integer division.
Patch by Richard Smith.
Reviewers: silvas, davidxl
Subscribers: rsmith, davidxl, llvm-commits
Differential Revision: http://reviews.llvm.org/D14845
llvm-svn: 253870
Summary:
This follows D14577 to treat ARMv6-J as an alias for ARMv6,
instead of an architecture in its own right.
The functional change is that the default CPU when targeting ARMv6-J
changes from arm1136j-s to arm1136jf-s, which is currently used as
the default CPU for ARMv6; both are, in fact, ARMv6-J CPUs.
The J-bit (Jazelle support) is irrelevant to LLVM, and it doesn't
affect code generation, attributes, optimizations, or anything else,
apart from selecting the default CPU.
Reviewers: rengolin, logan, compnerd
Subscribers: aemerson, llvm-commits, rengolin
Differential Revision: http://reviews.llvm.org/D14755
llvm-svn: 253675
It caused link errors of the form:
InstrProfiling.c:(.text.__llvm_profile_instrument_target+0x1c0): undefined reference to `__sync_fetch_and_add_8'
We had a network outage at the time of the commit so the first build to show a
problem is http://lab.llvm.org:8011/builders/clang-cmake-mips/builds/10827
llvm-svn: 253656
Ubsan detected undefined behavior in the MathExtras SaturatingMultiply test.
This change disables the test while it is being investigated.
llvm-svn: 253539
Summary:
This change adds MathExtras helper functions for handling unsigned, saturating addition and multiplication. It also updates the instrumentation and sample profile merge implementations to use them.
Reviewers: dnovillo, bogner, davidxl
Subscribers: davidxl, llvm-commits
Differential Revision: http://reviews.llvm.org/D14720
llvm-svn: 253497
Summary:
This change adds MathExtras helper functions for handling unsigned, saturating addition and multiplication. It also updates the instrumentation and sample profile merge implementations to use them.
No functional changes.
Reviewers: dnovillo, bogner, davidxl
Subscribers: davidxl, llvm-commits
Differential Revision: http://reviews.llvm.org/D14720
llvm-svn: 253412
Summary:
This patch changes the behavior of path::system_temp_directory() on Windows to be closer to GetTempPath Windows API call. Enforces path separator to be the native one, makes path absolute, etc. GetTempPath is not used directly because of limitations/implementation bugs on Windows 7.
Windows specific unit tests are added. Most of them runs in separated process with modified environment variables.
This change fixes FileSystemTest.CreateDir unittest that had been failing when run from Unix-like shell on Windows (Unix-like path separator (/) used in env variables).
Reviewers: chapuni, rafael, aaron.ballman
Subscribers: rafael, llvm-commits
Differential Revision: http://reviews.llvm.org/D14231
llvm-svn: 253345
Useful utility function; this wasn't too hard to do before, but also wasn't
obviously discoverable. Make it explicit. Reviewed offline by Michael
Gottesman.
llvm-svn: 253254
Summary:
* ARMv6KZ is the "canonical" name, given in the ARMARM
* ARMv6Z is an "official abbreviation" for it, mentioned in the ARMARM
* ARMv6ZK is a popular misspelling, which we should support as an alias.
The patch corrects the handling of the names.
Functional changes:
* ARMv6Z no longer treated as an architecture in its own right
* ARMv6ZK renamed to ARMv6KZ, accepting ARMv6ZK as an alias
* arm1176jz-s and arm1176jzf-s recognized as ARMv6ZK, instead of ARMv6K
* default ARMv6K CPU changed to arm1176j-s
Reviewers: rengolin, logan, compnerd
Subscribers: aemerson, llvm-commits, rengolin
Differential Revision: http://reviews.llvm.org/D14568
llvm-svn: 253206
Re-implement `ilist_node::getNextNode()` and `getPrevNode()` without
relying on the sentinel having a "next" pointer. Instead, get access to
the owning list and compare against the `begin()` and `end()` iterators.
This only works when the node *can* get access to the owning list. The
new support is in `ilist_node_with_parent<>`, and any class `Ty`
inheriting from `ilist_node<NodeTy>` that wants `getNextNode()` and/or
`getPrevNode()` should inherit from
`ilist_node_with_parent<NodeTy, ParentTy>` instead. The requirements:
- `NodeTy` must have a `getParent()` function that returns the parent.
- `ParentTy` must have a `getSublistAccess()` static that, given a(n
ignored) `NodeTy*` (to determine which list), returns a member field
pointer to the appropriate `ilist<>`.
This isn't the cleanest way to get access to the owning list, but it
leverages the API already used in the IR hierarchy (see, e.g.,
`Instruction::getSublistAccess()`).
If anyone feels like ripping out the calls to `getNextNode()` and
`getPrevNode()` and replacing with direct iterator logic, they can also
remove the access function, etc., but as an incremental step, I'm
maintaining the API where it's currently used in tree.
If these requirements are *not* met, call sites with access to the ilist
can call `iplist<NodeTy>::getNextNode(NodeTy*)` directly, as in
ilistTest.cpp.
Why rewrite this?
The old code was broken, calling `getNext()` on a sentinel that possibly
didn't have a "next" pointer at all! The new code avoids that
particular flavour of UB (see the commit message for r252538 for more
details about the "lucky" memory layout that made this function so
interesting).
There's still some UB here: the end iterator gets downcast to `NodeTy*`,
even when it's a sentinel (which is typically
`ilist_half_node<NodeTy*>`). I'll tackle that in follow-up commits.
See this llvm-dev thread for more details:
http://lists.llvm.org/pipermail/llvm-dev/2015-October/091115.html
What's the danger?
There might be some code that relies on `getNextNode()` or
`getPrevNode()` *never* returning `nullptr` -- i.e., that relies on them
being broken when the sentinel is an `ilist_half_node<NodeTy>`. I tried
to root out those cases with the audits I did leading up to r252380, but
it's possible I missed one or two. I hope not.
(If (1) you have out-of-tree code, (2) you've reverted r252380
temporarily, and (3) you get some weird crashes with this commit, then I
recommend un-reverting r252380 and auditing the compile errors looking
for "strange" implicit conversions.)
llvm-svn: 252694
- Make indexed value profile data more compact by peeling out
the per-site value count field into its own smaller sized array.
- Introduced formal data structure definitions to specify value
profile data layout in indexed format. Previously the layout
of the data is only assumed in the client code (scattered in
three different places : size computation, EmitData, and ReadData
- The new data structure serves as a central place for layout documentation.
- Add interfaces to force BE output for value profile data (testing purpose)
- Add byte swap unit tests
Differential Revision: http://reviews.llvm.org/D14401
llvm-svn: 252563
Summary:
In general GetTempDir follows the same logic as the replaced code: checks env variables TMP, TEMP, USERPROFILE in order. However, it also perform other checks like making separators native (\), making the path absolute, etc.
This change fixes FileSystemTest.CreateDir unittest that had been failing when run from Unix-like shell on Windows (Unix-like path separator (/) used in env variables).
Reviewers: chapuni, rafael, aaron.ballman
Subscribers: rafael, llvm-commits
Differential Revision: http://reviews.llvm.org/D14231
llvm-svn: 252366
Previously, subprograms contained a metadata reference to the function they
described. Because most clients need to get or set a subprogram for a given
function rather than the other way around, this created unneeded inefficiency.
For example, many passes needed to call the function llvm::makeSubprogramMap()
to build a mapping from functions to subprograms, and the IR linker needed to
fix up function references in a way that caused quadratic complexity in the IR
linking phase of LTO.
This change reverses the direction of the edge by storing the subprogram as
function-level metadata and removing DISubprogram's function field.
Since this is an IR change, a bitcode upgrade has been provided.
Fixes PR23367. An upgrade script for textual IR for out-of-tree clients is
attached to the PR.
Differential Revision: http://reviews.llvm.org/D14265
llvm-svn: 252219
Summary: On Windows we have to take UTF16 encoded env vars and convert them to UTF8. This patch fixes CopyEnvironment helper function used by process unit tests.
Reviewers: yaron.keren
Subscribers: yaron.keren, llvm-commits
Differential Revision: http://reviews.llvm.org/D14278
llvm-svn: 252039
Bypassing LLVM for this has a number of benefits:
1) Laziness support becomes asm-syntax agnostic (previously lazy jitting didn't
work on Windows as the resolver block was in Darwin asm).
2) For cross-process JITs, it allows resolver blocks and trampolines to be
emitted directly in the target process, reducing cross process traffic.
3) It should be marginally faster.
llvm-svn: 251933
Summary:
The new function sys::path::user_cache_directory tries to discover
a directory suitable for cache storage for current system user.
On Windows and Darwin it returns a path to system-specific user cache directory.
On Linux it follows XDG Base Directory Specification, what is:
- use non-empty $XDG_CACHE_HOME env var,
- use $HOME/.cache.
Reviewers: chapuni, aaron.ballman, rafael
Subscribers: rafael, aaron.ballman, llvm-commits
Differential Revision: http://reviews.llvm.org/D13801
llvm-svn: 251784
1. Added a set of public interfaces in InstrProfRecord
class to access (read/write) value profile data.
2. Changed IndexedProfile reader and writer code to
use the newly defined interfaces and hide implementation
details.
3. Added a couple of unittests for value profiling:
- Test new interfaces to get and set value profile data
- Test value profile data merging with various scenarios.
No functional change is expected. The new interfaces will also
make it possible to change on-disk format of value prof data
to be more compact (to be submitted).
llvm-svn: 251771
This complements CopyConstructorNotSmallTest. If we are testing the copy
constructor in such a way, we should also probably test assignment in the same
way.
llvm-svn: 251736
GNU tools require elfiamcu to take up the entire OS field, so, e.g.
i?86-*-linux-elfiamcu is not considered a legal triple.
Make us compatible.
Differential Revision: http://reviews.llvm.org/D14081
llvm-svn: 251390
This adds support for the i?86-*-elfiamcu triple, which indicates the IAMCU psABI is used.
Differential Revision: http://reviews.llvm.org/D13977
llvm-svn: 251222
In this mode it just tries to tail merge the strings without imposing any other
format constrains. It will not, for example, add a null byte between them.
Also add support for keeping a tentative size and offset if we decide to
not optimize after all.
This will be used shortly in lld for merging SHF_STRINGS sections.
llvm-svn: 251153
Summary: This will be used in a future change to ScalarEvolution.
Reviewers: hfinkel, reames, nlewycky
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D13612
llvm-svn: 250975
"external" AA wrapper pass.
This is a generic hook that can be used to thread custom code into the
primary AAResultsWrapperPass for the legacy pass manager in order to
allow it to merge external AA results into the AA results it is
building. It does this by threading in a raw callback and so it is
*very* powerful and should serve almost any use case I have come up with
for extending the set of alias analyses used. The only thing not well
supported here is using a *different order* of alias analyses. That form
of extension *is* supportable with the new pass manager, and I can make
the callback structure here more elaborate to support it in the legacy
pass manager if this is a critical use case that people are already
depending on, but the only use cases I have heard of thus far should be
reasonably satisfied by this simpler extension mechanism.
It is hard to test this using normal facilities (the built-in AAs don't
use this for obvious reasons) so I've written a fairly extensive set of
custom passes in the alias analysis unit test that should be an
excellent test case because it models the out-of-tree users: it adds
a totally custom AA to the system. This should also serve as
a reasonably good example and guide for out-of-tree users to follow in
order to rig up their existing alias analyses.
No support in opt for commandline control is provided here however. I'm
really unhappy with the kind of contortions that would be required to
support that. It would fully re-introduce the analysis group
self-recursion kind of patterns. =/
I've heard from out-of-tree users that this will unblock their use cases
with extending AAs on top of the new infrastructure and let us retain
the new analysis-group-free-world.
Differential Revision: http://reviews.llvm.org/D13418
llvm-svn: 250894
Summary: This patch replaces usage of deprecated SHGetFolderPathW with SHGetKnownFolderPath. The usage of SHGetKnownFolderPath is wrapped to allow queries for other "known" folders in the near future.
Reviewers: aaron.ballman, gbedwell
Subscribers: chapuni, llvm-commits
Differential Revision: http://reviews.llvm.org/D13753
llvm-svn: 250501
This patch adds the underlying infrastructure for an AVR backend to be included into LLVM. It is the first of a series of patches aimed at moving the out-of-tree AVR backend into the tree.
It consists of adding a new`Triple` target 'avr'.
llvm-svn: 250492
With r250345 and r250343, we start to observe the following failure
when bootstrap clang with lto and pgo:
PHI node entries do not match predecessors!
%.sroa.029.3.i = phi %"class.llvm::SDNode.13298"* [ null, %30953 ], [ null, %31017 ], [ null, %30998 ], [ null, %_ZN4llvm8dyn_castINS_14ConstantSDNodeENS_7SDValueEEENS_10cast_rettyIT_T0_E8ret_typeERS5_.exit.i.1804 ], [ null, %30975 ], [ null, %30991 ], [ null, %_ZNK4llvm3EVT13getScalarTypeEv.exit.i.1812 ], [ %..sroa.029.0.i, %_ZN4llvm11SmallVectorIiLj8EED1Ev.exit.i.1826 ], !dbg !451895
label %30998
label %_ZNK4llvm3EVTeqES0_.exit19.thread.i
LLVM ERROR: Broken function found, compilation aborted!
I will re-commit this if the bot does not recover.
llvm-svn: 250366
Currently in JumpThreading pass, the branch weight metadata is not updated after CFG modification. Consider the jump threading on PredBB, BB, and SuccBB. After jump threading, the weight on BB->SuccBB should be adjusted as some of it is contributed by the edge PredBB->BB, which doesn't exist anymore. This patch tries to update the edge weight in metadata on BB->SuccBB by scaling it by 1 - Freq(PredBB->BB) / Freq(BB->SuccBB).
This is the third attempt to submit this patch, while the first two led to failures in some FDO tests. After investigation, it is the edge weight normalization that caused those failures. In this patch the edge weight normalization is fixed so that there is no zero weight in the output and the sum of all weights can fit in 32-bit integer. Several unit tests are added.
Differential revision: http://reviews.llvm.org/D10979
llvm-svn: 250345
On Windows, fs::rename() could fail is another process was reading the
file at the same time using fs::openFileForRead(). In most cases the user
wouldn't notice as fs::rename() will continue to retry for 2000ms. Typically
this is enough for the read to complete and a retry to succeed, but if the
disk is being it too hard then the response time might be longer than the
retry time and the rename would fail with a permission error.
Add FILE_SHARE_DELETE to the sharing flags for CreateFileW() in
fs::openFileForRead() and try ReplaceFileW() prior to MoveFileExW()
in fs::rename().
Based on an initial patch by Edd Dawson!
Differential Revision: http://reviews.llvm.org/D13647
llvm-svn: 250046
Summary:
As per Duncan's review for D12536, I extracted the sub-byte bit aligned
reading and writing code into lib/Support, and generalized it. Added calls from
BackpatchWord. Also added unittests.
Reviewers: dexonsmith
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D13189
llvm-svn: 248897
Add support to the indexed instrprof reader and writer for the format
that will be used for value profiling.
Patch by Betul Buyukkurt, with minor modifications.
llvm-svn: 248833
HHVM calling convention, hhvmcc, is used by HHVM JIT for
functions in translated cache. We currently support LLVM back end to
generate code for X86-64 and may support other architectures in the
future.
In HHVM calling convention any GP register could be used to pass and
return values, with the exception of R12 which is reserved for
thread-local area and is callee-saved. Other than R12, we always
pass RBX and RBP as args, which are our virtual machine's stack pointer
and frame pointer respectively.
When we enter translation cache via hhvmcc function, we expect
the stack to be aligned at 16 bytes, i.e. skewed by 8 bytes as opposed
to standard ABI alignment. This affects stack object alignment and stack
adjustments for function calls.
One extra calling convention, hhvm_ccc, is used to call C++ helpers from
HHVM's translation cache. It is almost identical to standard C calling
convention with an exception of first argument which is passed in RBP
(before we use RDI, RSI, etc.)
Differential Revision: http://reviews.llvm.org/D12681
llvm-svn: 248832
BranchProbability now is represented by its numerator and denominator in uint32_t type. This patch changes this representation into a fixed point that is represented by the numerator in uint32_t type and a constant denominator 1<<31. This is quite similar to the representation of BlockMass in BlockFrequencyInfoImpl.h. There are several pros and cons of this change:
Pros:
1. It uses only a half space of the current one.
2. Some operations are much faster like plus, subtraction, comparison, and scaling by an integer.
Cons:
1. Constructing a probability using arbitrary numerator and denominator needs additional calculations.
2. It is a little less precise than before as we use a fixed denominator. For example, 1 - 1/3 may not be exactly identical to 1 / 3 (this will lead to many BranchProbability unit test failures). This should not matter when we only use it for branch probability. If we use it like a rational value for some precise calculations we may need another construct like ValueRatio.
One important reason for this change is that we propose to store branch probabilities instead of edge weights in MachineBasicBlock. We also want clients to use probability instead of weight when adding successors to a MBB. The current BranchProbability has more space which may be a concern.
Differential revision: http://reviews.llvm.org/D12603
llvm-svn: 248633
and assert when mask is too large to apply in the small case,
previously the extra words were silently ignored.
clang-format the entire function to match current code standards.
This is a rewrite of r247972 which was reverted in r247983 due to
warning and possible UB on 32-bits hosts.
llvm-svn: 247993
Extend mask value to 64 bits before taking its complement and assert when mask is
too large to apply in the small case (previously the extra words were silently ignored).
http://reviews.llvm.org/D11890
Patch by James Touton!
llvm-svn: 247972
This was a flawed change - it just caused the getElementType call to be
deferred until later, when we really need to remove it. Now that the IR
for GlobalAliases has been updated, the root cause is addressed that way
instead and this change is no longer needed (and in fact gets in the way
- because we want to pass the pointee type directly down further).
Follow up patches to push this through GlobalValue, bitcode format, etc,
will come along soon.
This reverts commit 236160.
llvm-svn: 247585
with the StringRef::split method when used with a MaxSplit argument
other than '-1' (which nobody really does today, but which should
actually work).
The spec claimed both to split up to MaxSplit times, but also to append
<= MaxSplit strings to the vector. One of these doesn't make sense.
Given the name "MaxSplit", let's go with it being a max over how many
*splits* occur, which means the max on how many strings get appended is
MaxSplit+1. I'm not actually sure the implementation correctly provided
this logic either, as it used a really opaque loop structure.
The implementation was also playing weird games with nullptr in the data
field to try to rely on a totally opaque hidden property of the split
method that returns a pair. Nasty IMO.
Replace all of this with what is (IMO) simpler code that doesn't use the
pair returning split method, and instead just finds each separator and
appends directly. I think this is a lot easier to read, and it most
definitely matches the spec. Added some tests that exercise the corner
cases around StringRef() and StringRef("") that all now pass.
I'll start using this in code in the next commit.
llvm-svn: 247249
on StringRef. Finding and splitting on a single character is
substantially faster than doing it on even a single character StringRef
-- we immediately get to a *very* tuned memchr call this way.
Even nicer, we get to this even in a debug build, shaving 18% off the
runtime of TripleTest.Normalization, helping PR23676 some more.
llvm-svn: 247244
The purpose is to allow templated wrapper to work with either
ArrayRef or any convertible operation:
template<typename Container>
void wrapper(const Container &Arr) {
impl(makeArrayRef(Arr));
}
with Container being a std::vector, a SmallVector, or an ArrayRef.
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 247214
with the new pass manager, and no longer relying on analysis groups.
This builds essentially a ground-up new AA infrastructure stack for
LLVM. The core ideas are the same that are used throughout the new pass
manager: type erased polymorphism and direct composition. The design is
as follows:
- FunctionAAResults is a type-erasing alias analysis results aggregation
interface to walk a single query across a range of results from
different alias analyses. Currently this is function-specific as we
always assume that aliasing queries are *within* a function.
- AAResultBase is a CRTP utility providing stub implementations of
various parts of the alias analysis result concept, notably in several
cases in terms of other more general parts of the interface. This can
be used to implement only a narrow part of the interface rather than
the entire interface. This isn't really ideal, this logic should be
hoisted into FunctionAAResults as currently it will cause
a significant amount of redundant work, but it faithfully models the
behavior of the prior infrastructure.
- All the alias analysis passes are ported to be wrapper passes for the
legacy PM and new-style analysis passes for the new PM with a shared
result object. In some cases (most notably CFL), this is an extremely
naive approach that we should revisit when we can specialize for the
new pass manager.
- BasicAA has been restructured to reflect that it is much more
fundamentally a function analysis because it uses dominator trees and
loop info that need to be constructed for each function.
All of the references to getting alias analysis results have been
updated to use the new aggregation interface. All the preservation and
other pass management code has been updated accordingly.
The way the FunctionAAResultsWrapperPass works is to detect the
available alias analyses when run, and add them to the results object.
This means that we should be able to continue to respect when various
passes are added to the pipeline, for example adding CFL or adding TBAA
passes should just cause their results to be available and to get folded
into this. The exception to this rule is BasicAA which really needs to
be a function pass due to using dominator trees and loop info. As
a consequence, the FunctionAAResultsWrapperPass directly depends on
BasicAA and always includes it in the aggregation.
This has significant implications for preserving analyses. Generally,
most passes shouldn't bother preserving FunctionAAResultsWrapperPass
because rebuilding the results just updates the set of known AA passes.
The exception to this rule are LoopPass instances which need to preserve
all the function analyses that the loop pass manager will end up
needing. This means preserving both BasicAAWrapperPass and the
aggregating FunctionAAResultsWrapperPass.
Now, when preserving an alias analysis, you do so by directly preserving
that analysis. This is only necessary for non-immutable-pass-provided
alias analyses though, and there are only three of interest: BasicAA,
GlobalsAA (formerly GlobalsModRef), and SCEVAA. Usually BasicAA is
preserved when needed because it (like DominatorTree and LoopInfo) is
marked as a CFG-only pass. I've expanded GlobalsAA into the preserved
set everywhere we previously were preserving all of AliasAnalysis, and
I've added SCEVAA in the intersection of that with where we preserve
SCEV itself.
One significant challenge to all of this is that the CGSCC passes were
actually using the alias analysis implementations by taking advantage of
a pretty amazing set of loop holes in the old pass manager's analysis
management code which allowed analysis groups to slide through in many
cases. Moving away from analysis groups makes this problem much more
obvious. To fix it, I've leveraged the flexibility the design of the new
PM components provides to just directly construct the relevant alias
analyses for the relevant functions in the IPO passes that need them.
This is a bit hacky, but should go away with the new pass manager, and
is already in many ways cleaner than the prior state.
Another significant challenge is that various facilities of the old
alias analysis infrastructure just don't fit any more. The most
significant of these is the alias analysis 'counter' pass. That pass
relied on the ability to snoop on AA queries at different points in the
analysis group chain. Instead, I'm planning to build printing
functionality directly into the aggregation layer. I've not included
that in this patch merely to keep it smaller.
Note that all of this needs a nearly complete rewrite of the AA
documentation. I'm planning to do that, but I'd like to make sure the
new design settles, and to flesh out a bit more of what it looks like in
the new pass manager first.
Differential Revision: http://reviews.llvm.org/D12080
llvm-svn: 247167
This makes RemoveDuplicatePHINodes more effective and fixes an assertion
failure. Triggering the assertions requires a DenseSet reallocation
so this change only contains a constructive test.
I'll explain the issue with a small example. In the following function
there's a duplicate PHI, %4 and %5 are identical. When this is found
the DenseSet in RemoveDuplicatePHINodes contains %2, %3 and %4.
define void @F() {
br label %1
; <label>:1 ; preds = %1, %0
%2 = phi i32 [ 42, %0 ], [ %4, %1 ]
%3 = phi i32 [ 42, %0 ], [ %5, %1 ]
%4 = phi i32 [ 42, %0 ], [ 23, %1 ]
%5 = phi i32 [ 42, %0 ], [ 23, %1 ]
br label %1
}
after RemoveDuplicatePHINodes runs the function looks like this. %3 has
changed and is now identical to %2, but RemoveDuplicatePHINodes never
saw this.
define void @F() {
br label %1
; <label>:1 ; preds = %1, %0
%2 = phi i32 [ 42, %0 ], [ %4, %1 ]
%3 = phi i32 [ 42, %0 ], [ %4, %1 ]
%4 = phi i32 [ 42, %0 ], [ 23, %1 ]
br label %1
}
If the DenseSet does a reallocation now it will reinsert all
keys and stumble over %3 now having a different hash value than it had
when inserted into the map for the first time. This change clears the
set whenever a PHI is deleted and starts the progress from the
beginning, allowing %3 to be deleted and avoiding inconsistent DenseSet
state. This potentially has a negative performance impact because
it rescans all PHIs, but I don't think that this ever makes a difference
in practice.
llvm-svn: 246694
We only looked through casts when one operand was a constant. We can also look through casts when both operands are non-constant, but both are in fact the same cast type. For example:
%1 = icmp ult i8 %a, %b
%2 = zext i8 %a to i32
%3 = zext i8 %b to i32
%4 = select i1 %1, i32 %2, i32 %3
llvm-svn: 246678
of its strings when expanding the string literals from the macros, and
push all of the APIs to be StringRef instead of C-string APIs.
This (remarkably) removes a very non-trivial number of strlen calls. It
even deletes code and complexity from one of the primary users -- Clang.
llvm-svn: 246374
This fixes PR24621 and matches what we do for `DILocation`. Although
the limit seems somewhat artificial, there are places in the backend
that also assume 16-bit columns, so we may as well just be consistent
about the limits.
llvm-svn: 246349
Add `Function::setSubprogram()` and `Function::getSubprogram()`,
convenience methods to forward to `setMetadata()` and `getMetadata()`,
respectively, and deal in `DISubprogram` instead of `MDNode`.
Also add a verifier check to enforce that `!dbg` attachments are always
subprograms.
Originally (when I had the llvm-dev discussion back in April) I thought
I'd store a pointer directly on `llvm::Function` for these attachments
-- we frequently have debug info, and that's much cheaper than using map
in the context if there are no other function-level attachments -- but
for now I'm just using the generic infrastructure. Let's add the extra
complexity only if this shows up in a profile.
llvm-svn: 246339
This commit extends the 'SlotMapping' structure and includes mappings for named
and numbered types in it. The LLParser is extended accordingly to fill out
those mappings at the end of module parsing.
This information is useful when we want to parse standalone constant values
at a later stage using the 'parseConstantValue' method. The constant values
can be constant expressions, which can contain references to types. In order
to parse such constant values, we have to restore the internal named and
numbered mappings for the types in LLParser, otherwise the parser will report
a parsing error. Therefore, this commit also introduces a new method called
'restoreParsingState' to LLParser, which uses the slot mappings to restore
some of its internal parsing state.
This commit is required to serialize constant value pointers in the machine
memory operands for the MIR format.
Reviewers: Duncan P. N. Exon Smith
llvm-svn: 245740
This is something like nullopt in std::experimental::optional. Optional
could already be constructed from None, so this seems like an obvious
extension from there.
I have a use in a future patch for Clang, though it may not go that
way/end up used - so this seemed worth committing now regardless.
llvm-svn: 245518
folding the code into the main Analysis library.
There already wasn't much of a distinction between Analysis and IPA.
A number of the passes in Analysis are actually IPA passes, and there
doesn't seem to be any advantage to separating them.
Moreover, it makes it hard to have interactions between analyses that
are both local and interprocedural. In trying to make the Alias Analysis
infrastructure work with the new pass manager, it becomes particularly
awkward to navigate this split.
I've tried to find all the places where we referenced this, but I may
have missed some. I have also adjusted the C API to continue to be
equivalently functional after this change.
Differential Revision: http://reviews.llvm.org/D12075
llvm-svn: 245318
This change makes ScalarEvolution a stand-alone object and just produces
one from a pass as needed. Making this work well requires making the
object movable, using references instead of overwritten pointers in
a number of places, and other refactorings.
I've also wired it up to the new pass manager and added a RUN line to
a test to exercise it under the new pass manager. This includes basic
printing support much like with other analyses.
But there is a big and somewhat scary change here. Prior to this patch
ScalarEvolution was never *actually* invalidated!!! Re-running the pass
just re-wired up the various other analyses and didn't remove any of the
existing entries in the SCEV caches or clear out anything at all. This
might seem OK as everything in SCEV that can uses ValueHandles to track
updates to the values that serve as SCEV keys. However, this still means
that as we ran SCEV over each function in the module, we kept
accumulating more and more SCEVs into the cache. At the end, we would
have a SCEV cache with every value that we ever needed a SCEV for in the
entire module!!! Yowzers. The releaseMemory routine would dump all of
this, but that isn't realy called during normal runs of the pipeline as
far as I can see.
To make matters worse, there *is* actually a key that we don't update
with value handles -- there is a map keyed off of Loop*s. Because
LoopInfo *does* release its memory from run to run, it is entirely
possible to run SCEV over one function, then over another function, and
then lookup a Loop* from the second function but find an entry inserted
for the first function! Ouch.
To make matters still worse, there are plenty of updates that *don't*
trip a value handle. It seems incredibly unlikely that today GVN or
another pass that invalidates SCEV can update values in *just* such
a way that a subsequent run of SCEV will incorrectly find lookups in
a cache, but it is theoretically possible and would be a nightmare to
debug.
With this refactoring, I've fixed all this by actually destroying and
recreating the ScalarEvolution object from run to run. Technically, this
could increase the amount of malloc traffic we see, but then again it is
also technically correct. ;] I don't actually think we're suffering from
tons of malloc traffic from SCEV because if we were, the fact that we
never clear the memory would seem more likely to have come up as an
actual problem before now. So, I've made the simple fix here. If in fact
there are serious issues with too much allocation and deallocation,
I can work on a clever fix that preserves the allocations (while
clearing the data) between each run, but I'd prefer to do that kind of
optimization with a test case / benchmark that shows why we need such
cleverness (and that can test that we actually make it faster). It's
possible that this will make some things faster by making the SCEV
caches have higher locality (due to being significantly smaller) so
until there is a clear benchmark, I think the simple change is best.
Differential Revision: http://reviews.llvm.org/D12063
llvm-svn: 245193
This causes the other special members (like move and copy construction,
and move assignment) to come through for free. Some code in clang was
depending on the (deprecated, in the original code) copy ctor. Now that
there's no user-defined special members, they're all available without
any deprecation concerns.
llvm-svn: 244835
The select pattern recognition in ValueTracking (as used by InstCombine
and SelectionDAGBuilder) only knew about integer patterns. This teaches
it about minimum and maximum operations.
matchSelectPattern() has been extended to return a struct containing the
existing Flavor and a new enum defining the pattern's behavior when
given one NaN operand.
C minnum() is defined to return the non-NaN operand in this case, but
the idiomatic C "a < b ? a : b" would return the NaN operand.
ARM and AArch64 at least have different instructions for these different cases.
llvm-svn: 244580
'llvm::TrailingObjects<`anonymous-namespace'::Class1,short,llvm::NoTrailingTypeArg>::additionalSizeToAlloc' :
cannot access protected member declared in class
'llvm::TrailingObjects<`anonymous-namespace'::Class1,short,llvm::NoTrailingTypeArg>'
I'm not sure how this compiles with gcc.
Aren't protecteded members accessible only with protected or public inheritance?
llvm-svn: 244199
This is the first mechanical step in preparation for making this and all
the other alias analysis passes available to the new pass manager. I'm
factoring out all the totally boring changes I can so I'm moving code
around here with no other changes. I've even minimized the formatting
churn.
I'll reformat and freshen comments on the interface now that its located
in the right place so that the substantive changes don't triger this.
llvm-svn: 244197
This is intended to help support the idiom of a class that has some
other objects (or multiple arrays of different types of objects)
appended on the end, which is used quite heavily in clang.
Differential Revision: http://reviews.llvm.org/D11272
llvm-svn: 244164
For example of mingw-w64-g++-4.8.1,
llvm/unittests/ADT/ArrayRefTest.cpp: In member function 'virtual void {anonymous}::ArrayRefTest_AllocatorCopy_Test::TestBody()':
llvm/unittests/ADT/ArrayRefTest.cpp:56:40: internal compiler error: in count_type_elements, at expr.c:5523
} Array3Src[] = {{"hello"}, {"world"}};
^
Please submit a full bug report,
with preprocessed source if appropriate.
llvm-svn: 244017
Various value handles needed to be copy constructible and copy
assignable (mostly for their use in DenseMap). But to avoid an API that
might allow accidental slicing, make these members protected in the base
class and make derived classes final (the special members become
implicitly public there - but disallowing further derived classes that
might be sliced to the intermediate type).
Might be worth having a warning a bit like -Wnon-virtual-dtor that
catches public move/copy assign/ctors in classes with virtual functions.
(suppressable in the same way - by making them protected in the base,
and making the derived classes final) Could be fancier and only diagnose
them when they're actually called, potentially.
Also allow a few default implementations where custom implementations
(especially with non-standard return types) were implemented.
llvm-svn: 243909
Fixes obvious memory leak in test
TestForEofAfterReadFailureOnDataStreamer. Also removes constexpr use
from same test.
Patch by Karl Schimpf.
Differential Revision: http://reviews.llvm.org/D11735
llvm-svn: 243904