This will avoid a link-time error, as the default constructor of RegisterBankInfo is
the only one available when GlobalISel is not built.
llvm-svn: 265549
This recommits the patch below after fixing an invalid memory reference that
happened when the DenseMap grew and moved memory. I verified it fixed the bootstrap
problem on x86_64-linux-gnu, but I cannot verify whether it fixes
the bootstrap error on clang-ppc64be-linux. I will watch the build-bot
results closely.
Replace analyzeSiblingValues with a new algorithm to fix its compile
time issue. The patch solves PR17409 and its duplicates.
analyzeSiblingValues is an N x N complexity algorithm, where N is
the number of siblings generated by register splitting. Although it
causes a significant compile time issue when N is large, it is also
important for performance since it removes redundant spills and
enables rematerialization.
To solve the compile time issue, the patch removes analyzeSiblingValues
and replaces it with lower cost alternatives containing two parts. The
first part creates a new spill hoisting method in postOptimization of
register allocation. It does spill hoisting once, after all the spills
are generated, instead of inside every instance of selectOrSplit. The
second part queries the defining expression of the original register for
rematerialization and keeps it always available during register allocation
even if it is already dead. It deletes those dead instructions only in
postOptimization. With the two parts in the patch, we can remove
analyzeSiblingValues without sacrificing performance.
Differential Revision: http://reviews.llvm.org/D15302
llvm-svn: 265547
Summary:
When the backedge taken condition is computed from an icmp, SCEV can
deduce the backedge taken count only if one of the sides of the icmp
is an AddRecExpr. However, due to sign/zero extensions, we sometimes
end up with something that is not an AddRecExpr.
However, we can use SCEV predicates to produce a 'guarded' expression.
This change adds a method to SCEV to get this expression, and the
SCEV predicate associated with it.
In HowManyGreaterThans and HowManyLessThans we will now add a SCEV
predicate associated with the guarded backedge taken count when the
analyzed SCEV expression is not an AddRecExpr. Note that we only do
this as an alternative to returning a 'CouldNotCompute'.
We use this new feature in Loop Access Analysis and LoopVectorize to analyze
and transform more loops.
Reviewers: anemet, mzolotukhin, hfinkel, sanjoy
Subscribers: flyingforyou, mcrosier, atrick, mssimpso, sanjoy, mzolotukhin, llvm-commits
Differential Revision: http://reviews.llvm.org/D17201
llvm-svn: 265535
AArch64InstrInfo::optimizeCompareInstr has a bug which causes generation of incorrect code (PR#27158).
The patch refactors the function to simplify reviewing the fix of the bug.
1. The function name 'modifiesConditionCode' is changed to 'areCFlagsAccessedBetweenInstrs'
to reflect that the function can check modifying accesses, reading accesses, or both.
2. Function 'AArch64InstrInfo::optimizeCompareInstr'
- Documented the function
- Cmp_NZCV is renamed to DeadNZCVIdx to reflect that it is the operand index of the dead NZCV
- The code for the case of substituting CmpInstr is put into separate
functions, the main one being 'substituteCmpInstr'.
Differential Revision: http://reviews.llvm.org/D18609
llvm-svn: 265531
To quote the langref "Unlike sqrt in libm, however, llvm.sqrt has
undefined behavior for negative numbers other than -0.0 (which allows
for better optimization, because there is no need to worry about errno
being set). llvm.sqrt(-0.0) is defined to return -0.0 like IEEE sqrt."
This means that it's unsafe to replace sqrt with llvm.sqrt unless the
call is annotated with nnan.
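For illustration, a minimal IR sketch (hypothetical functions, not from the patch's tests) of when the rewrite is and is not allowed:
```
declare double @sqrt(double)
declare double @llvm.sqrt.f64(double)

define double @ok_to_replace(double %x) {
  ; nnan excludes the negative-input case, so rewriting to @llvm.sqrt.f64 is safe
  %r = call nnan double @sqrt(double %x)
  ret double %r
}

define double @keep_libm_call(double %x) {
  ; without nnan, %x may be negative and llvm.sqrt would have undefined behavior
  %r = call double @sqrt(double %x)
  ret double %r
}
```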
Thanks to Hal Finkel for pointing this out!
llvm-svn: 265521
Instead of copying arguments from the source function to the
destination, steal them. This has a few advantages.
- The ValueMap doesn't need to be seeded with (or cleared of)
Arguments.
- Often the destination function won't have created any arguments yet,
so this avoids malloc traffic.
- Argument names don't need to be copied.
Because argument lists are lazy, this required a new
Function::stealArgumentListFrom helper.
llvm-svn: 265519
r265273 added Mapper::mapBlockAddress, which delays mapping a
blockaddress value until the function has a body. The condition was
backwards, and should be checking Function::empty instead of
GlobalValue::isDeclaration.
llvm-svn: 265508
Instead of crashing, give a nice error. As a drive-by, fix the location
associated with the errors for unresolved metadata (the location was off
by one token).
llvm-svn: 265507
This patch enables sibling call optimization on the ppc64 ELFv1/ELFv2 ABIs and
adds a couple of test cases. The patch also passed the llvm/clang bootstrap
test and spec2006 build/run/result validation.
Original issue: https://llvm.org/bugs/show_bug.cgi?id=25617
Great thanks to Tom (tjablin) for his help; he contributed a lot to this patch.
Thanks to Hal and Kit for their invaluable opinions!
Reviewers: hfinkel kbarton
http://reviews.llvm.org/D16315
llvm-svn: 265506
This patch implements the following Book II and Book III instructions:
- copy copy_first cp_abort paste paste. paste_last
- msgsync
- slbieg slbsync
- stop
Total 10 instructions
Reviewers: nemanjai hfinkel tjablin amehsan kbarton
llvm-svn: 265504
While preserving the return value for @llvm.experimental.deoptimize at
the IR level is useful during mid-level optimization, doing so at the
machine instruction level requires generating some extra code and a
return that is non-ideal. This change has LLVM lower
```
%val = call @llvm.experimental.deoptimize
ret %val
```
to effectively
```
call @__llvm_deoptimize()
unreachable
```
instead.
llvm-svn: 265502
As part of the TRI argument of addRegBankCoverage we already have access to
the TargetRegisterClass through the ID of that register class.
Therefore, there is no point in needing a TargetRegisterClass instance;
the ID is enough to get to it.
llvm-svn: 265487
Don't emit a gc.result for a statepoint lowered from
@llvm.experimental.deoptimize since the call into __llvm_deoptimize is
effectively noreturn. Instead follow the corresponding gc.statepoint
with an "unreachable".
llvm-svn: 265485
Bionic has a defined thread-local location for the stack protector
cookie. Emit a direct load instead of going through __stack_chk_guard.
llvm-svn: 265481
Prior to this patch, CFLAA wouldn't tag arguments/globals properly if
it didn't find any "interesting" edges on them. This means that, if all
you do is store constants to a global or argument, we would never
actually treat it as a global/argument.
Test case:
define void @foo(i32* %A, i32* %B) #0 {
entry:
store i32 0, i32* %A, align 4
store i32 0, i32* %B, align 4
ret void
}
CFLAA would say that %A can't alias %B, because neither pointer was
used in an interesting way. This patch makes us note whether something
is an argument, global, ... regardless of how interesting CFLAA thinks
its uses are.
(For the record, using a value in an interesting way means loading
from it, using it in a GEP, ...)
llvm-svn: 265474
Add a common parent class for ConstantArray, ConstantVector, and
ConstantStruct called ConstantAggregate. These are the aggregate
subclasses of Constant that take operands.
This is mainly a cleanup, adding a common `isa` target and removing
duplicated code. However, it also simplifies caching which constants
point transitively at `GlobalValue` (a possible future direction).
llvm-svn: 265466
Change the default constructor to create an invalid object.
The target will have to properly initialize the register banks before
using them.
llvm-svn: 265460
This commit completely rewrites Mapper::mapMetadata (the implementation
of llvm::MapMetadata) using an iterative algorithm. The guts of the new
algorithm are in MDNodeMapper::map, the entry function in a new class.
Previously, Mapper::mapMetadata performed a recursive exploration of the
graph with eager "just in case there's a reason" malloc traffic.
The new algorithm has these benefits:
- New nodes and temporaries are not created eagerly.
- Uniquing cycles are not duplicated (see new unit test).
- No recursion.
Given a node to map, it does this:
1. Use a worklist to perform a post-order traversal of the transitively
referenced unmapped nodes.
2. Track which nodes will change operands, and which will have new
addresses in the mapped scheme. Propagate the changes through the
POT until fixed point, to pick up uniquing cycles that need to
change.
3. Map all the distinct nodes without touching their operands. If
RF_MoveDistinctMetadata, they get mapped to themselves; otherwise,
they get mapped to clones.
4. Map the uniqued nodes (bottom-up), lazily creating temporaries for
forward references as needed.
5. Remap the operands of the distinct nodes.
Mehdi helped me out by profiling this with -flto=thin. On his workload
(importing/etc. for opt.cpp), MapMetadata sped up by 15%, contributed
about 50% less to persistent memory, and made about 100x fewer calls to
malloc. The speedup is less than I'd hoped. The profile mainly blames
DenseMap lookups; perhaps there's a way to reduce them (e.g., by
disallowing remapping of MDString).
It would be nice to break the strange remaining recursion on the Value
side: MapValue => materializeInitFor => RemapInstruction => MapValue. I
think we could do this by having materializeInitFor return a worklist of
things to be remapped.
llvm-svn: 265456
Some Include What You Use suggestions were used too.
Use anonymous namespaces in source files.
Differential revision: http://reviews.llvm.org/D18778
llvm-svn: 265454
We only generate LOCKed versions of add/sub when the result is unused.
It often happens that the result is used, but only by a comparison. We
can optimize those out by reusing EFLAGS, which lets us use the proper
instructions, instead of having to fall back to LXADD.
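For illustration, a hypothetical IR pattern of the kind this targets (the atomic RMW result is only consumed by a comparison):
```
define i1 @dec_and_test(i32* %p) {
  ; the only use of the updated value is the compare against zero, so the
  ; EFLAGS set by a locked SUB can be reused instead of LXADD + CMP
  %old = atomicrmw sub i32* %p, i32 1 seq_cst
  %new = sub i32 %old, 1
  %z = icmp eq i32 %new, 0
  ret i1 %z
}
```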
Instead of doing this as an MI peephole (as we do for the other
non-LOCKed (really, non-MR) forms), do it in ISel. It becomes quite
tricky later.
This also makes it eventually possible to stop expanding and/or/xor
if the only user is an icmp (also see D18141).
This uses the LOCK ISD opcodes added by r262244.
Differential Revision: http://reviews.llvm.org/D17633
llvm-svn: 265450
At IR level, the swifterror argument is an input argument with type
ErrorObject**. For targets that support swifterror, we want to optimize it
to behave as an inout value with type ErrorObject*; it will be passed in a
fixed physical register.
The main idea is to track the virtual registers for each swifterror value. We
define swifterror values as AllocaInsts with swifterror attribute or a function
argument with swifterror attribute.
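As a rough sketch (hypothetical types and function names), the two forms of swifterror values look like this:
```
%swift.error = type opaque

declare void @may_fail(%swift.error** swifterror %err)

define void @caller() {
entry:
  ; swifterror alloca: tracked via virtual registers instead of a stack slot
  %err = alloca swifterror %swift.error*, align 8
  store %swift.error* null, %swift.error** %err
  ; swifterror argument: passed in a fixed physical register on supporting targets
  call void @may_fail(%swift.error** swifterror %err)
  ret void
}
```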
In SelectionDAGISel.cpp, we set up swifterror values (SwiftErrorVals) before
handling the basic blocks.
When iterating over all basic blocks in RPO, before actually visiting the basic
block, we call mergeIncomingSwiftErrors to merge incoming swifterror values when
there are multiple predecessors or to simply propagate them. There, we create a
virtual register for each swifterror value in the entry block. For predecessors
that are not yet visited, we create virtual registers to hold the swifterror
values at the end of the predecessor. The assignments are saved in
SwiftErrorWorklist and will be materialized at the end of visiting the basic
block.
When visiting a load from a swifterror value, we copy from the current virtual
register assignment. When visiting a store to a swifterror value, we create a
virtual register to hold the swifterror value and update SwiftErrorMap to
track the current virtual register assignment.
Differential Revision: http://reviews.llvm.org/D18108
llvm-svn: 265433
Summary: LanaiSetflagAluCombiner could previously combine instructions across basic blocks even when it was not legal to do so. Make the LanaiSetflagAluCombiner more conservative to avoid this.
Reviewers: eliben
Subscribers: joker.eph, llvm-commits
Differential Revision: http://reviews.llvm.org/D18746
llvm-svn: 265411
Removed the SDNode argument passed to the AI_smul and AI_smla multiclass
definitions, as it is always mul.
Differential Revision: http://reviews.llvm.org/D18791
llvm-svn: 265409
Presently, CodeGenPrepare deletes all nearly empty (only phi and branch)
basic blocks. This pass can delete loop preheaders which frequently creates
critical edges. A preheader can be a convenient place to spill registers to
the stack. If the entrance to a loop body is a critical edge, then spills
may occur in the loop body rather than immediately before it. This patch
protects loop preheaders from deletion in CodeGenPrepare even if they are
nearly empty.
Since the patch alters the CFG, it affects a large number of test cases.
In most cases, the changes are merely cosmetic (basic blocks have different
names or instruction orders change slightly). I am somewhat concerned about
the test/CodeGen/Mips/brdelayslot.ll test case. If the loop preheader is not
deleted, then the MIPS backend does not take advantage of a branch delay
slot. Consequently, I would like some close review by a MIPS expert.
The patch also partially subsumes D16893 from George Burgess IV. George
correctly notes that CodeGenPrepare does not actually preserve the dominator
tree. I think the dominator tree was usually not valid when CodeGenPrepare
ran, but I am using LoopInfo to mark preheaders, so the dominator tree is
now always valid before CodeGenPrepare.
Author: Tom Jablin (tjablin)
Reviewers: hfinkel george.burgess.iv vkalintiris dsanders kbarton cycheng
http://reviews.llvm.org/D16984
llvm-svn: 265397
This patch adds support for compact jumps similar to the previous compact
branch support for MIPSR6. Unlike compact branches, compact jumps do not
have a forbidden slot.
As MipsInstrInfo::getEquivalentCompactForm can determine the correct
expansion for jumps and branches for both microMIPS and MIPSR6, remove the
unnecessary distinction in the delay slot filler.
Reviewers: vkalintiris
Subscribers: llvm-commits, dsanders
llvm-svn: 265390
Refactor common code that queries the ModuleSummaryIndex for a value's
GlobalValueInfo struct into getGlobalValueInfo helper methods, which
will also be used by D18763.
llvm-svn: 265370
Summary:
I encountered this issue when constant folding during inlining tried to
fold away a bitcast of a double to an x86_mmx, which is not an integral
type. The test case exposes the same issue with a smaller code snippet
during early CSE.
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D18528
llvm-svn: 265367
We were using array_pod_sort on an array of type 'Attribute', which
wraps a pointer to AttributeImpl. For the most part this didn't matter
because the printing code prints enum attributes in a defined order, but
integer attributes such as 'align' and 'dereferenceable' were not
ordered.
Furthermore, AttributeImpl::operator< was broken for integer attributes.
An integer attribute is a kind and an integer value, and both pieces
need to be compared.
By fixing the comparison operator, we can go back to std::sort, and
things look good now. This should fix clang arm-swiftcall.c test
failures on Windows.
llvm-svn: 265361
There is no problem with the code today, but the fix will avoid a crash
in test/CodeGen/AMDGPU/subreg-coalescer-undef-use.ll once the
DetectDeadLanes pass is added.
llvm-svn: 265351
Remove a default parameter value being passed unnecessarily, which
also reduces the changes required when this parameter is changed in
D18763.
Document the remaining non-default bool value passed for another
parameter.
llvm-svn: 265348
I noticed that this isn't covered by our existing tests and spent some
time trying to come up with an example it actually hits. I tried hand
rolling something based on the explanation in the comment, but couldn't
get anything that didn't abort tail duplication earlier for one reason
or another.
Then, I tried cranking tail-dup-size up so this would fire more often,
and ran a bootstrap of clang and the nightly test suite - those
don't hit this either.
This reverts r132816 and replaces it with an assert.
llvm-svn: 265347
The original commit miscompiled things on 32-bit Windows, e.g. a Clang
bootstrap. It turns out that mergeSPUpdates() was a bit too generous in
what it interpreted as a stack adjustment, causing the following code:
addl $12, %esp
leal -4(%ebp), %esp
To be "optimized" into simply:
addl $8, %esp
This commit tightens up mergeSPUpdates() and includes a new test
(test14 in movtopush.ll) for this situation.
llvm-svn: 265345
That commit looks wonderful and awesome. Sadly, it greatly exacerbates
PR17409 and effectively regresses build time for a lot of (very large)
code when compiled with ASan or MSan.
We thought this could be fixed forward by landing D15302 which at last
fixes that PR, but some issues were discovered and it looks like that
got reverted, so reverting this as well temporarily. As soon as the fix
for PR17409 lands and sticks, we should re-land this patch as it won't
trigger more significant test cases hitting that bug.
Many thanks to Quentin and Wei here as they're doing all the awesome
hard work!!!
llvm-svn: 265331
Direct callees that are cast to other function prototypes
show up in the Call/Invoke instructions as ConstantExprs.
Currently, llvm::CallSite's getCalledFunction() fails
to return the callees in such expressions as direct calls.
Value profiling should avoid instrumenting such cases. Mostly NFC.
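A hypothetical example of the pattern in question:
```
declare i32 @callee(i32)

define void @caller() {
  ; the call is direct, but getCalledFunction() only sees a ConstantExpr bitcast
  call void bitcast (i32 (i32)* @callee to void ()*)()
  ret void
}
```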
llvm-svn: 265330
We can only perform a tail call to a callee that preserves all the
registers that the caller needs to preserve.
This situation happens with calling conventions like preserve_mostcc or
cxx_fast_tls. It was explicitly handled for fast_tls and failing for
preserve_most. This patch generalizes the check to any calling
convention.
Related to rdar://24207743
Differential Revision: http://reviews.llvm.org/D18680
llvm-svn: 265329
Summary:
Useful for debugging since we lose this correlation after the permodule
summary/VST is read and until we later materialize source modules in the
function importer.
Reviewers: joker.eph
Subscribers: llvm-commits, joker.eph
Differential Revision: http://reviews.llvm.org/D18555
llvm-svn: 265327
Summary:
To aid in debugging, dump out the correlation between value names and
GUID for each source module when it is materialized. This will make it
easier to comprehend the earlier summary-based function importing debug
trace which only has access to and prints the GUIDs.
Reviewers: joker.eph
Subscribers: llvm-commits, joker.eph
Differential Revision: http://reviews.llvm.org/D18556
llvm-svn: 265326
A seg-fault occurs due to a dereference of a null pointer, which is
the value returned by getConstantPart. This function returns
null if the constant part is not found. The code that calls this
function needs to check for the null return value.
Differential Revision: http://reviews.llvm.org/D18718
llvm-svn: 265319
Use the MachineFunctionProperty mechanism to indicate whether a MachineFunction
is in SSA form instead of a custom method on MachineRegisterInfo. NFC
Differential Revision: http://reviews.llvm.org/D18574
llvm-svn: 265318
Summary:
This adds the same checks that were added in r264593 to all
target-specific passes that run after register allocation.
Reviewers: qcolombet
Subscribers: jyknight, dsanders, llvm-commits
Differential Revision: http://reviews.llvm.org/D18525
llvm-svn: 265313
Replace analyzeSiblingValues with a new algorithm to fix its compile
time issue. The patch solves PR17409 and its duplicates.
analyzeSiblingValues is an N x N complexity algorithm, where N is
the number of siblings generated by register splitting. Although it
causes a significant compile time issue when N is large, it is also
important for performance since it removes redundant spills and
enables rematerialization.
To solve the compile time issue, the patch removes analyzeSiblingValues
and replaces it with lower cost alternatives containing two parts. The
first part creates a new spill hoisting method in postOptimization of
register allocation. It does spill hoisting once, after all the spills
are generated, instead of inside every instance of selectOrSplit. The
second part queries the defining expression of the original register for
rematerialization and keeps it always available during register allocation
even if it is already dead. It deletes those dead instructions only in
postOptimization. With the two parts in the patch, we can remove
analyzeSiblingValues without sacrificing performance.
Differential Revision: http://reviews.llvm.org/D15302
llvm-svn: 265309
Summary:
At this point we should be able to enable IAS by default for O32 without
breaking check-all, or recursion.
Reviewers: vkalintiris
Subscribers: dsanders, llvm-commits
Differential Revision: http://reviews.llvm.org/D18439
llvm-svn: 265302
This adds MC support for fused compare + indirect branch instructions,
i.e. CRB, CGRB, CLRB, CLGRB, CIB, CGIB, CLIB, CLGIB. They aren't actually
generated yet -- this is preparation for their use for conditional
returns in the next iteration of D17339.
Author: koriakin
Differential Revision: http://reviews.llvm.org/D18742
llvm-svn: 265296
A cross-thread sequentially consistent fence should be lowered into
z/Architecture's BCR serialization instruction, instead of causing a
fatal error in the back-end.
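A minimal example (hypothetical function) of the kind of input that previously hit the fatal error:
```
define void @publish(i32* %p) {
  store i32 1, i32* %p, align 4
  ; cross-thread seq_cst fence, now lowered to a BCR serialization instruction
  fence seq_cst
  ret void
}
```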
Author: bryanpkc
Differential Revision: http://reviews.llvm.org/D18644
llvm-svn: 265292
Enable the SystemZ back-end to lower FRAMEADDR and RETURNADDR, which
previously would cause the back-end to crash. Currently, only a
frame count of zero is supported.
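For example (a sketch, hypothetical function name), the intrinsics that now lower instead of crashing:
```
declare i8* @llvm.frameaddress(i32)
declare i8* @llvm.returnaddress(i32)

define i8* @current_return_address() {
  ; only a depth (frame count) of 0 is currently supported
  %ra = call i8* @llvm.returnaddress(i32 0)
  ret i8* %ra
}
```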
Author: bryanpkc
Differential Revision: http://reviews.llvm.org/D18514
llvm-svn: 265291
Implemented truncstore for KNL and skylake-avx512.
Covered vectors from v2i1 to v64i1. We save the value in bits (not in bytes) - v32i1 is saved in 4 bytes.
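A hypothetical example of one of the covered cases, a v32i1 truncating store (32 bits, i.e. 4 bytes, in memory):
```
define void @store_mask(<32 x i8> %x, <32 x i1>* %p) {
  ; the trunc + store pair becomes a truncating store of 32 one-bit elements
  %m = trunc <32 x i8> %x to <32 x i1>
  store <32 x i1> %m, <32 x i1>* %p
  ret void
}
```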
Differential Revision: http://reviews.llvm.org/D18740
llvm-svn: 265283
Remove a few old FIXMEs from the original commit of the Metadata/Value
split in r223802. These are commented out assertions to the effect that
calls between mapValue and mapMetadata never return nullptr.
(The only behaviour change is that Mapper::mapSimpleMetadata memoizes
the nullptr return.)
When I originally rewrote the mapping code, I thought we could be
stricter in the new metadata hierarchy and never return nullptr when
RF_NullMapMissingGlobalValues was off. It's still not entirely clear to
me why these assertions failed (a few months ago, I had a theory that I
forgot to write down, but that's helping no one).
Understood or not, I no longer see how these commented-out assertions
would be useful. I'm relegating them to the annals of source control
before making significant changes to ValueMapper.cpp.
llvm-svn: 265282
RAUW support on MDNode usually requires an extra allocation for
ReplaceableMetadataImpl. This is only strictly necessary if there are
tracking references to the MDNode. Make the construction of
ReplaceableMetadataImpl lazy, so that we don't get allocations if we
don't need them.
Since MDNode::isResolved now checks MDNode::isTemporary and
MDNode::NumUnresolved instead of whether a ReplaceableMetadataImpl is
allocated, the internal changes are intrusive (at various internal
checkpoints, isResolved now has a different answer).
However, there should be no real functionality change here; just
slightly lazier allocation behaviour. The external semantics should be
identical.
llvm-svn: 265279
This adds an assertion to maintain the property from r265273. When
Mapper::mapSimpleMetadata calls Mapper::mapValue, it should not find its
way back to mapMetadataImpl. This guarantees that mapSimpleMetadata is
not involved in any recursion.
Since Mapper::mapValue calls out to arbitrary materializers, we need to
save a bit on the ValueMap to make this assertion effective.
There should be no functionality change here. This co-recursion should
already have been impossible.
llvm-svn: 265276
The main change is to delay materializing GlobalValue initializers from
Mapper::mapValue until Mapper::~Mapper. This effectively removes all
recursion from mapSimpleMetadata, as promised in r265270.
mapSimpleMetadata calls mapValue for ConstantAsMetadata nodes to
find the mapped constant, and now it shouldn't be possible for mapValue
to indirectly re-invoke mapMetadata. I'll add an assertion to that
effect in a follow-up (separated so that the assertion can easily be
reverted independently, if it comes to that).
This a step toward a broader goal: converting Mapper::mapMetadataImpl
from a recursive to an iterative algorithm.
When a BlockAddress points at a BasicBlock inside an unmaterialized
function body, we need to delay it until the function body is
materialized in Mapper::~Mapper. This commit creates a temporary
BasicBlock and returns a new BlockAddress, then RAUWs the BasicBlock
once it is known. This situation should be extremely rare since a
BlockAddress is usually used from within the function it's referencing
(and BlockAddress itself is rare).
There should be no observable functionality change.
llvm-svn: 265273
Split out a helper for mapping metadata without operands. This is any
metadata that is not an MDNode, and any MDNode where the answer is known
without looking at operands.
Through some weird twists, this function is co-recursive:
mapSimpleMetadata
=> MapValue
=> materializeInitFor
=> linkFunctionBody
=> RemapInstructions
=> MapMetadata
=> mapSimpleMetadata
I plan to break the recursion in a follow-up.
llvm-svn: 265270
Add support for lowering with the MOVMSK instruction to extract vector element signbits to a GPR.
This is an early step towards more optimal handling of vector comparison results.
Differential Revision: http://reviews.llvm.org/D18741
llvm-svn: 265266
Sinking comparisons in CGP can undo the job of hoisting them done
earlier by LICM, and soft-FP makes this an expensive mistake.
A common pattern that produces floating point comparisons uniform
over a loop is an explicit check for division by zero. If the divisor
is hoisted out of the loop, the comparison can also be, but hoisting
the function that unwinds is never legal, since it may cause side
effects in the loop body prior to the unwinding to not be executed.
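A sketch of the pattern (hypothetical names): the fcmp on the loop-invariant divisor is hoisted by LICM, while the call that unwinds stays inside the loop:
```
declare void @raise_div_by_zero()

define void @scale(double* %out, double* %in, double %d, i64 %n) {
entry:
  %div.by.zero = fcmp oeq double %d, 0.000000e+00   ; hoisted out of the loop
  br label %loop

loop:
  %i = phi i64 [ 0, %entry ], [ %i.next, %cont ]
  ; CGP should not sink the hoisted fcmp back down to this use
  br i1 %div.by.zero, label %raise, label %cont

raise:
  call void @raise_div_by_zero()   ; may unwind; hoisting this call would be illegal
  ret void

cont:
  %src = getelementptr double, double* %in, i64 %i
  %v = load double, double* %src, align 8
  %q = fdiv double %v, %d
  %dst = getelementptr double, double* %out, i64 %i
  store double %q, double* %dst, align 8
  %i.next = add i64 %i, 1
  %done = icmp eq i64 %i.next, %n
  br i1 %done, label %exit, label %loop

exit:
  ret void
}
```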
Differential Revision: http://reviews.llvm.org/D18744
llvm-svn: 265264
Tidied up comments, stripped trailing whitespace, split apart nodes that aren't related.
No change in ordering although there is definitely some scope for it.
llvm-svn: 265263
Floating point intrinsics in LLVM are generally not speculatively
executed, since most of them are defined to behave the same as libm
functions, which set errno.
However, the only error that can happen when executing ceil, floor,
nearbyint, rint and round libm functions per POSIX.1-2001 is -ERANGE,
and that requires the maximum value of the exponent to be smaller
than the number of mantissa bits, which is not the case with any of
the floating point types supported by LLVM.
The trunc and copysign functions never set errno per POSIX.1-2001.
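For instance, a hypothetical shape of a transform this enables (speculating the intrinsic across a branch):
```
declare double @llvm.ceil.f64(double)

define double @round_up_if(double %x, i1 %c) {
entry:
  br i1 %c, label %then, label %exit

then:
  ; ceil cannot set errno for any supported FP type, so this call may be speculated
  %r = call double @llvm.ceil.f64(double %x)
  br label %exit

exit:
  %res = phi double [ %r, %then ], [ %x, %entry ]
  ret double %res
}
```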
Differential Revision: http://reviews.llvm.org/D18643
llvm-svn: 265262
We already skip optimizations if the return value
of printf() is used, so CI->use_empty() is always
true.
Differential Revision: http://reviews.llvm.org/D18656
llvm-svn: 265253
Summary:
* Fix to stop delay slot filler from inserting SP modifying instructions in the newly expanded call/return instructions.
* In LowerSymbol the outermost type was not LanaiMCExpr if there was a binary expression
* Remove printExpr in LanaiInstPrinter
Subscribers: joker.eph, llvm-commits
Differential Revision: http://reviews.llvm.org/D18734
llvm-svn: 265251
Add support for the AArch64 .cpu directive. This is a slightly involved
directive since the parameter is actually a variable encoded string. The
general structure is:
<cpu>[[+-]<feature>]*
We now map some of the supported feature string names to the internal
representation of feature flags. If we encounter one which we do not support,
we bail out, as we cannot validate the assembly any longer.
Resolves PR27010.
llvm-svn: 265240
Split the loop through compile units in mapUnneededSubprograms in two.
First, visit imported entities to ensure that we've visited all needed
subprograms. Second, visit subprograms, and drop the ones we don't
need.
Hypothetically this protects against a subprogram from one compile unit
being referenced from an imported entity in a different compile unit. I
don't think that's valid IR (a debug info expert could confirm), but I
think the refactor makes the code more clear.
llvm-svn: 265233
IRLinker::mapUnneededSubprograms has to be sure that any "needed"
subprograms get linked in. Rather than traversing through imported
entities using llvm::getSubprogram, call MapMetadata. The latter
memoizes the result in the ValueMap (sharing work with
IRLinker::linkNamedMDNodes proper), and makes the local SmallPtrSet
redundant.
llvm-svn: 265231
Instead of checking on the fly during MapMetadata whether a subprogram is
needed, seed the ValueMap with `nullptr` up-front.
There is a small hypothetical functionality change. Previously, calling
MapMetadataOp on a node whose "scope:" chain led to an unneeded
subprogram would return nullptr. However, if that were ever called,
then the subprogram would be needed; a situation that the IRMover is
supposed to avoid a priori!
Besides cleaning up the code a little, this restores a nice property:
MapMetadataOp returns the same as MapMetadata.
llvm-svn: 265229
Support seeding a ValueMap with nullptr for Metadata entries, a
situation I didn't consider in the Metadata/Value split.
I added a ValueMapper::getMappedMD accessor that returns an
Optional<Metadata*> with the mapped (possibly null) metadata. IRMover
needs to use this to avoid modifying the map when it's checking for
unneeded subprograms. I updated a call from bugpoint since I find the
new code clearer.
llvm-svn: 265228
Whenever metadata is only referenced by a single function, emit the
metadata just in that function block. This should improve lazy-loading
by reducing the amount of metadata in the global block.
For now, this should catch all DILocations, and anything else that
happens to be referenced only by a single function.
It's also a first step toward a couple of possible future directions
(which this commit does *not* implement):
1. Some debug info metadata is only referenced from compile units and
individual functions. If we can drop the link from the compile
unit, this optimization will get more powerful.
2. Any uniqued metadata that isn't referenced globally can in theory be
emitted in every function block that references it (trading off
bitcode size and full-parse time vs. lazy-load time).
Note: this assumes the new BitcodeReader error checking from r265223.
The metadata stored in function blocks gets purged after parsing each
function, which means unresolved forward references will get lost.
Since all the global metadata should have already been resolved by the
time we get to the function metadata blocks we just need to check for
that case. (If for some reason we need to handle bitcode that fails the
checks in r265223, the fix is to store about-to-be-dropped unresolved
nodes in MetadataList::shrinkTo until they can be handled successfully by
a future call to MetadataList::tryToResolveCycles.)
llvm-svn: 265226
Further unify the handling of function-local metadata with global
metadata, by exposing the same interface in ValueEnumerator. Both
contexts use the same accessors:
- getMDStrings(): get the strings for this block.
- getNonMDStrings(): get the non-strings for this block.
A future commit will start adding strings to the function-block.
llvm-svn: 265224
A follow-up commit will start using function metadata blocks more
heavily. This commit adds some error checking to confirm that metadata
is fully resolved before (and after) materializing each function.
This is valid even when reading very old bitcode from before the
metadata/value split. The global metadata block always came before the
function blocks. However, in case somehow this causes a regression
(i.e., an old LLVM did produce such bitcode after all) I'm committing
separately.
llvm-svn: 265223
Summary: This should make the code more readable, especially all the map declarations.
Reviewers: tejohnson
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D18721
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 265215
Incremental LTO will use a cache to store object files.
This patch handles the pruning part of the cache, exposing
a few knobs:
- Pruning interval: the implementation keeps a "timestamp" file in the
directory and will scan it only after a given interval since the
last modification of the timestamp file. This is for performance
purposes; we don't want to scan the folder continuously.
- Entry expiration: this is the time after which a file that hasn't
been used is removed from the cache.
- Maximum size: expressed as a percentage of the available disk space,
it helps avoid blowing up the disk space.
http://reviews.llvm.org/D18422
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 265209
Use a helper function to find all the direct call sites in a function.
Also split the code into a separate file, as this will be used by the
indirect-call-promotion transformation.
Differential Revision: http://reviews.llvm.org/D18704
llvm-svn: 265199
This fixes various symbolization test failures for me when I build with a
hermetic VS2015 without having run the 2015 installer.
http://reviews.llvm.org/D18707
llvm-svn: 265193
These functions can be dropped by the compiler if they are no longer
referenced in the current module. However, there is a chance that
another module is still referencing them because of the import.
Multiple solutions can be used:
- Always import LinkOnce when a caller is imported. This ensures that
every module with a call to a LinkOnce has the definition and will
be able to emit it if it emits the call.
- Turn the LinkOnce into Weak, so that it is always emitted.
- Turn all LinkOnce into available_externally and come back after all
modules are codegen'ed to emit only one copy of the linkonce, when
there is still a reference to it.
This patch implements the second option, with an optimization that
only *one* module will turn the LinkOnce into Weak, while the others
will turn it into available_externally, so that there is exactly one
copy emitted for the whole compilation.
http://reviews.llvm.org/D18346
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 265190
A ``swifterror`` attribute can be applied to a function parameter or an
AllocaInst.
This commit does not include any target-specific change. The target-specific
optimization will come as a follow-up patch.
Differential Revision: http://reviews.llvm.org/D18092
llvm-svn: 265189
ThreadModel::Single is already handled by ARMPassConfig adding
LowerAtomicPass to the pass list, which lowers all atomics to non-atomic
ops and deletes fences.
So by the time we get to ISel, there are no atomic fences left, and they
don't need special handling.
llvm-svn: 265178