This is a very limited implementation of DFG-based copy propagation.
It only handles actual COPY instructions (does not handle other equivalents
such as add-immediate with a 0 operand).
The major limitation is that it does not update the DFG: that will be the
change required to make it more robust (hopefully coming up soon).
llvm-svn: 257490
Target independent, SSA-based data flow framework for representing
data flow between physical registers.
This commit implements the creation of the actual data flow graph.
llvm-svn: 257477
This restores the previous behavior of not including the mnemonic in the classes table for every target that starts instruction lines with the mnemonic. Not only did the table size increase by 1 entry, but the class enum increased in size which caused every class in the array to increase in size. It also grew the size of the function that parsers tokens into classes by a substantial amount.
This adds a new HasMnemonicFirst flag to all AsmParsers. It's set to 1 by default and Hexagon target overrides it to 0.
For the X86 target alone this recovers 324KB of size on the llvm-mc executable.
I believe the current state is still a bad design choice for the Hexagon target as it causes most of the parsing to do a linear search through the entire match table to comparing operands against every instruction until it finds one that works. At least for the other targets we do a binary search based on mnemonic over which to do the linear scan.
llvm-svn: 256669
This removes an unpleasant hack involving a global variable for special
lowering of certain memcpy calls. These are now lowered as intended in
EmitTargetCodeForMemcpy in the same way that other targets do it.
llvm-svn: 255785
This will make the depedence graph more accurate if an alias analysis
is provided. If nullptr is specified in its place, the behavior will
remain as it is currently.
llvm-svn: 255540
This patch adds some missing calls to MBB::normalizeSuccProbs() in several
locations where it should be called. Those places are found by checking if the
sum of successors' probabilities is approximate one in MachineBlockPlacement
pass with some instrumented code (not in this patch).
Differential revision: http://reviews.llvm.org/D15259
llvm-svn: 255455
std::hex is not used anywhere in LLVM code base except for this place,
and it has a known undefined behavior (at least in libstdc++ 4.9.3):
https://llvm.org/bugs/show_bug.cgi?id=18156, which fires in UBSan
bootstrap of LLVM.
llvm-svn: 254547
The ARM ARM is clear that 128-bit loads are only guaranteed to have been atomic
if there has been a corresponding successful stxp. It's less clear for AArch32, so
I'm leaving that alone for now.
llvm-svn: 254524
(This is the second attempt to submit this patch. The first caused two assertion
failures and was reverted. See https://llvm.org/bugs/show_bug.cgi?id=25687)
The patch in http://reviews.llvm.org/D13745 is broken into four parts:
1. New interfaces without functional changes (http://reviews.llvm.org/D13908).
2. Use new interfaces in SelectionDAG, while in other passes treat probabilities
as weights (http://reviews.llvm.org/D14361).
3. Use new interfaces in all other passes.
4. Remove old interfaces.
This patch is 3+4 above. In this patch, MBB won't provide weight-based
interfaces any more, which are totally replaced by probability-based ones.
The interface addSuccessor() is redesigned so that the default probability is
unknown. We allow unknown probabilities but don't allow using it together
with known probabilities in successor list. That is to say, we either have a
list of successors with all known probabilities, or all unknown
probabilities. In the latter case, we assume each successor has 1/N
probability where N is the number of successors. An assertion checks if the
user is attempting to add a successor with the disallowed mixed use as stated
above. This can help us catch many misuses.
All uses of weight-based interfaces are now updated to use probability-based
ones.
Differential revision: http://reviews.llvm.org/D14973
llvm-svn: 254377
and the follow-up r254356: "Fix a bug in MachineBlockPlacement that may cause assertion failure during BranchProbability construction."
Asserts were firing in Chromium builds. See PR25687.
llvm-svn: 254366
The patch in http://reviews.llvm.org/D13745 is broken into four parts:
1. New interfaces without functional changes (http://reviews.llvm.org/D13908).
2. Use new interfaces in SelectionDAG, while in other passes treat probabilities
as weights (http://reviews.llvm.org/D14361).
3. Use new interfaces in all other passes.
4. Remove old interfaces.
This patch is 3+4 above. In this patch, MBB won't provide weight-based
interfaces any more, which are totally replaced by probability-based ones.
The interface addSuccessor() is redesigned so that the default probability is
unknown. We allow unknown probabilities but don't allow using it together
with known probabilities in successor list. That is to say, we either have a
list of successors with all known probabilities, or all unknown
probabilities. In the latter case, we assume each successor has 1/N
probability where N is the number of successors. An assertion checks if the
user is attempting to add a successor with the disallowed mixed use as stated
above. This can help us catch many misuses.
All uses of weight-based interfaces are now updated to use probability-based
ones.
Differential revision: http://reviews.llvm.org/D14973
llvm-svn: 254348
This is a temporary fix to address ICE on 2005-10-21-longlonggtu.ll.
The proper fix will be to use A2_tfrsi, but it will need more work to
teach all users of A2_tfrsi to also expect a floating-point operand.
llvm-svn: 254099
Extended DFA tablegen to:
- added "-debug-only dfa-emitter" support to llvm-tblgen
- defined CVI_PIPE* resources for the V60 vector coprocessor
- allow specification of multiple required resources
- supports ANDs of ORs
- e.g. [SLOT2, SLOT3], [CVI_MPY0, CVI_MPY1] means:
(SLOT2 OR SLOT3) AND (CVI_MPY0 OR CVI_MPY1)
- added support for combo resources
- allows specifying ORs of ANDs
- e.g. [CVI_XLSHF, CVI_MPY01] means:
(CVI_XLANE AND CVI_SHIFT) OR (CVI_MPY0 AND CVI_MPY1)
- increased DFA input size from 32-bit to 64-bit
- allows for a maximum of 4 AND'ed terms of 16 resources
- supported expressions now include:
expression => term [AND term] [AND term] [AND term]
term => resource [OR resource]*
resource => one_resource | combo_resource
combo_resource => (one_resource [AND one_resource]*)
Author: Dan Palermo <dpalermo@codeaurora.org>
kparzysz: Verified AMDGPU codegen to be unchanged on all llc
tests, except those dealing with instruction encodings.
Reapply the previous patch, this time without circular dependencies.
llvm-svn: 253793
Extended DFA tablegen to:
- added "-debug-only dfa-emitter" support to llvm-tblgen
- defined CVI_PIPE* resources for the V60 vector coprocessor
- allow specification of multiple required resources
- supports ANDs of ORs
- e.g. [SLOT2, SLOT3], [CVI_MPY0, CVI_MPY1] means:
(SLOT2 OR SLOT3) AND (CVI_MPY0 OR CVI_MPY1)
- added support for combo resources
- allows specifying ORs of ANDs
- e.g. [CVI_XLSHF, CVI_MPY01] means:
(CVI_XLANE AND CVI_SHIFT) OR (CVI_MPY0 AND CVI_MPY1)
- increased DFA input size from 32-bit to 64-bit
- allows for a maximum of 4 AND'ed terms of 16 resources
- supported expressions now include:
expression => term [AND term] [AND term] [AND term]
term => resource [OR resource]*
resource => one_resource | combo_resource
combo_resource => (one_resource [AND one_resource]*)
Author: Dan Palermo <dpalermo@codeaurora.org>
kparzysz: Verified AMDGPU codegen to be unchanged on all llc
tests, except those dealing with instruction encodings.
llvm-svn: 253790
If a section is rw, it is irrelevant if the dynamic linker will write to
it or not.
It looks like llvm implemented this because gcc was doing it. It looks
like gcc implemented this in the hope that it would put all the
relocated items close together and speed up the dynamic linker.
There are two problem with this:
* It doesn't work. Both bfd and gold will map .data.rel to .data and
concatenate the input sections in the order they are seen.
* If we want a feature like that, it can be implemented directly in the
linker since it knowns where the dynamic relocations are.
llvm-svn: 253436
MCRelaxableFragment previously kept a copy of MCSubtargetInfo and
MCInst to enable re-encoding the MCInst later during relaxation. A copy
of MCSubtargetInfo (instead of a reference or pointer) was needed
because the feature bits could be modified by the parser.
This commit replaces the MCSubtargetInfo copy in MCRelaxableFragment
with a constant reference to MCSubtargetInfo. The copies of
MCSubtargetInfo are kept in MCContext, and the target parsers are now
responsible for asking MCContext to provide a copy whenever the feature
bits of MCSubtargetInfo have to be toggled.
With this patch, I saw a 4% reduction in peak memory usage when I
compiled verify-uselistorder.lto.bc using llc.
rdar://problem/21736951
Differential Revision: http://reviews.llvm.org/D14346
llvm-svn: 253127
MCSubtargetInfo in the subclasses into MCTargetAsmParser and define a
member function getSTI.
This is done in preparation for making changes to shrink the size of
MCRelaxableFragment. (see http://reviews.llvm.org/D14346).
llvm-svn: 253124
Summary:
The CLR's personality routine passes these in rdx/edx, not rax/eax.
Make getExceptionPointerRegister a virtual method parameterized by
personality function to allow making this distinction.
Similarly make getExceptionSelectorRegister a virtual method parameterized
by personality function, for symmetry.
Reviewers: pgavlin, majnemer, rnk
Subscribers: jyknight, dsanders, llvm-commits
Differential Revision: http://reviews.llvm.org/D14344
llvm-svn: 252383
Some implicit ilist iterator conversions have crept back into Analysis,
Transforms, Hexagon, and llvm-stress. This removes them.
I'll commit a patch immediately after this to disallow them (in a
separate patch so that it's easy to revert if necessary).
llvm-svn: 252371
There are two things out of the ordinary in this commit. First, I made
a loop obviously "infinite" in HexagonInstrInfo.cpp. After checking if
an instruction was at the beginning of a basic block (in which case,
`break`), the loop decremented and checked the iterator for `nullptr` as
the loop condition. This has never been possible (the prev pointers are
always been circular, so even with the weird ilist/iplist
implementation, this isn't been possible), so I removed the condition.
Second, in HexagonAsmPrinter.cpp there was another case of comparing a
`MachineBasicBlock::instr_iterator` against `MachineBasicBlock::end()`
(which returns `MachineBasicBlock::iterator`). While not incorrect,
it's fragile. I switched this to `::instr_end()`.
All that said, no functionality change intended here.
llvm-svn: 250778
- Isolate the check for the existence of a stack frame into hasFP.
- Implement getFrameIndexReference for DWARF address computation.
- Use getFrameIndexReference for offset computation in eliminateFrameIndex.
- Preserve debug information for dynamically allocated stack objects.
- Prefer FP to access local objects at -O0.
- Add experimental code to skip allocframe when not strictly necessary
(disabled by default).
llvm-svn: 250718
Emit the CFI instructions after all code transformation have been done.
This will avoid any interference between CFI instructions and packetization.
llvm-svn: 250714
This extends the work done in r233995 so that now getFragment (in addition to
getSection) also works for variable symbols.
With that the existing logic to decide if a-b can be computed works even if
a or b are variables. Given that, the expression evaluation can avoid expanding
variables as aggressively and that in turn lets the relocation code see the
original variable.
In order for this to work with the asm streamer, there is now a dummy fragment
per section. It is used to assign a section to a symbol when no other fragment
exists.
This patch is a joint work by Maxim Ostapenko andy myself.
llvm-svn: 249303
Summary:
This is the first patch in the series to migrate Triple's (which are ambiguous)
to TargetTuple's (which aren't).
For the moment, TargetTuple simply passes all requests to the Triple object it
holds. Once it has replaced Triple, it will start to implement the interface in
a more suitable way.
This change makes some changes to the public C++ API. In particular,
InitMCSubtargetInfo(), createMCRelocationInfo(), and createMCSymbolizer()
now take TargetTuples instead of Triples. The other public C++ API's have
been left as-is for the moment to reduce patch size.
This commit also contains a trivial patch to clang to account for the C++ API
change. Thanks go to Pavel Labath for fixing LLDB for me.
Reviewers: rengolin
Subscribers: jyknight, dschuff, arsenm, rampitec, danalbert, srhines, javed.absar, dsanders, echristo, emaste, jholewinski, tberghammer, ted, jfb, llvm-commits, rengolin
Differential Revision: http://reviews.llvm.org/D10969
llvm-svn: 247692
Summary:
This is the first patch in the series to migrate Triple's (which are ambiguous)
to TargetTuple's (which aren't).
For the moment, TargetTuple simply passes all requests to the Triple object it
holds. Once it has replaced Triple, it will start to implement the interface in
a more suitable way.
This change makes some changes to the public C++ API. In particular,
InitMCSubtargetInfo(), createMCRelocationInfo(), and createMCSymbolizer()
now take TargetTuples instead of Triples. The other public C++ API's have
been left as-is for the moment to reduce patch size.
This commit also contains a trivial patch to clang to account for the C++ API
change.
Reviewers: rengolin
Subscribers: jyknight, dschuff, arsenm, rampitec, danalbert, srhines, javed.absar, dsanders, echristo, emaste, jholewinski, tberghammer, ted, jfb, llvm-commits, rengolin
Differential Revision: http://reviews.llvm.org/D10969
llvm-svn: 247683
We used to have this magic "hasLoadLinkedStoreConditional()" callback,
which really meant two things:
- expand cmpxchg (to ll/sc).
- expand atomic loads using ll/sc (rather than cmpxchg).
Remove it, and, instead, introduce explicit callbacks:
- bool shouldExpandAtomicCmpXchgInIR(inst)
- AtomicExpansionKind shouldExpandAtomicLoadInIR(inst)
Differential Revision: http://reviews.llvm.org/D12557
llvm-svn: 247429
With subregister liveness enabled we can detect the case where only
parts of a register are live in, this is expressed as a 32bit lanemask.
The current code only keeps registers in the live-in list and therefore
enumerated all subregisters affected by the lanemask. This turned out to
be too conservative as the subregister may also cover additional parts
of the lanemask which are not live. Expressing a given lanemask by
enumerating a minimum set of subregisters is computationally expensive
so the best solution is to simply change the live-in list to store the
lanemasks as well. This will reduce memory usage for targets using
subregister liveness and slightly increase it for other targets
Differential Revision: http://reviews.llvm.org/D12442
llvm-svn: 247171
with the new pass manager, and no longer relying on analysis groups.
This builds essentially a ground-up new AA infrastructure stack for
LLVM. The core ideas are the same that are used throughout the new pass
manager: type erased polymorphism and direct composition. The design is
as follows:
- FunctionAAResults is a type-erasing alias analysis results aggregation
interface to walk a single query across a range of results from
different alias analyses. Currently this is function-specific as we
always assume that aliasing queries are *within* a function.
- AAResultBase is a CRTP utility providing stub implementations of
various parts of the alias analysis result concept, notably in several
cases in terms of other more general parts of the interface. This can
be used to implement only a narrow part of the interface rather than
the entire interface. This isn't really ideal, this logic should be
hoisted into FunctionAAResults as currently it will cause
a significant amount of redundant work, but it faithfully models the
behavior of the prior infrastructure.
- All the alias analysis passes are ported to be wrapper passes for the
legacy PM and new-style analysis passes for the new PM with a shared
result object. In some cases (most notably CFL), this is an extremely
naive approach that we should revisit when we can specialize for the
new pass manager.
- BasicAA has been restructured to reflect that it is much more
fundamentally a function analysis because it uses dominator trees and
loop info that need to be constructed for each function.
All of the references to getting alias analysis results have been
updated to use the new aggregation interface. All the preservation and
other pass management code has been updated accordingly.
The way the FunctionAAResultsWrapperPass works is to detect the
available alias analyses when run, and add them to the results object.
This means that we should be able to continue to respect when various
passes are added to the pipeline, for example adding CFL or adding TBAA
passes should just cause their results to be available and to get folded
into this. The exception to this rule is BasicAA which really needs to
be a function pass due to using dominator trees and loop info. As
a consequence, the FunctionAAResultsWrapperPass directly depends on
BasicAA and always includes it in the aggregation.
This has significant implications for preserving analyses. Generally,
most passes shouldn't bother preserving FunctionAAResultsWrapperPass
because rebuilding the results just updates the set of known AA passes.
The exception to this rule are LoopPass instances which need to preserve
all the function analyses that the loop pass manager will end up
needing. This means preserving both BasicAAWrapperPass and the
aggregating FunctionAAResultsWrapperPass.
Now, when preserving an alias analysis, you do so by directly preserving
that analysis. This is only necessary for non-immutable-pass-provided
alias analyses though, and there are only three of interest: BasicAA,
GlobalsAA (formerly GlobalsModRef), and SCEVAA. Usually BasicAA is
preserved when needed because it (like DominatorTree and LoopInfo) is
marked as a CFG-only pass. I've expanded GlobalsAA into the preserved
set everywhere we previously were preserving all of AliasAnalysis, and
I've added SCEVAA in the intersection of that with where we preserve
SCEV itself.
One significant challenge to all of this is that the CGSCC passes were
actually using the alias analysis implementations by taking advantage of
a pretty amazing set of loop holes in the old pass manager's analysis
management code which allowed analysis groups to slide through in many
cases. Moving away from analysis groups makes this problem much more
obvious. To fix it, I've leveraged the flexibility the design of the new
PM components provides to just directly construct the relevant alias
analyses for the relevant functions in the IPO passes that need them.
This is a bit hacky, but should go away with the new pass manager, and
is already in many ways cleaner than the prior state.
Another significant challenge is that various facilities of the old
alias analysis infrastructure just don't fit any more. The most
significant of these is the alias analysis 'counter' pass. That pass
relied on the ability to snoop on AA queries at different points in the
analysis group chain. Instead, I'm planning to build printing
functionality directly into the aggregation layer. I've not included
that in this patch merely to keep it smaller.
Note that all of this needs a nearly complete rewrite of the AA
documentation. I'm planning to do that, but I'd like to make sure the
new design settles, and to flesh out a bit more of what it looks like in
the new pass manager first.
Differential Revision: http://reviews.llvm.org/D12080
llvm-svn: 247167
We can now run 32-bit programs with empty catch bodies. The next step
is to change PEI so that we get funclet prologues and epilogues.
llvm-svn: 246235
function.
This was the same as getFrameIndexReference, but without the FrameReg
output.
Differential Revision: http://reviews.llvm.org/D12042
llvm-svn: 245148
This commit removes the global manager variable which is responsible for
storing and allocating pseudo source values and instead it introduces a new
manager class named 'PseudoSourceValueManager'. Machine functions now own an
instance of the pseudo source value manager class.
This commit also modifies the 'get...' methods in the 'MachinePointerInfo'
class to construct pseudo source values using the instance of the pseudo
source value manager object from the machine function.
This commit updates calls to the 'get...' methods from the 'MachinePointerInfo'
class in a lot of different files because those calls now need to pass in a
reference to a machine function to those methods.
This change will make it easier to serialize pseudo source values as it will
enable me to transform the mips specific MipsCallEntry PseudoSourceValue
subclass into two target independent subclasses.
Reviewers: Akira Hatanaka
llvm-svn: 244693
Create wrapper methods in the Function class for the OptimizeForSize and MinSize
attributes. We want to hide the logic of "or'ing" them together when optimizing
just for size (-Os).
Currently, we are not consistent about this and rely on a front-end to always set
OptimizeForSize (-Os) if MinSize (-Oz) is on. Thus, there are 18 FIXME changes here
that should be added as follow-on patches with regression tests.
This patch is NFC-intended: it just replaces existing direct accesses of the attributes
by the equivalent wrapper call.
Differential Revision: http://reviews.llvm.org/D11734
llvm-svn: 243994
Remove some unnecessary explicit special members in Hexagon that, once
removed, allow the other implicit special members to be used without
depending on deprecated features.
llvm-svn: 243825
This patch does the following:
* Fix FIXME on `needsStackRealignment`: it is now shared between multiple targets, implemented in `TargetRegisterInfo`, and isn't `virtual` anymore. This will break out-of-tree targets, silently if they used `virtual` and with a build error if they used `override`.
* Factor out `canRealignStack` as a `virtual` function on `TargetRegisterInfo`, by default only looks for the `no-realign-stack` function attribute.
Multiple targets duplicated the same `needsStackRealignment` code:
- Aarch64.
- ARM.
- Mips almost: had extra `DEBUG` diagnostic, which the default implementation now has.
- PowerPC.
- WebAssembly.
- x86 almost: has an extra `-force-align-stack` option, which the default implementation now has.
The default implementation of `needsStackRealignment` used to just return `false`. My current patch changes the behavior by simply using the above shared behavior. This affects:
- AMDGPU
- BPF
- CppBackend
- MSP430
- NVPTX
- Sparc
- SystemZ
- XCore
- Out-of-tree targets
This is a breaking change! `make check` passes.
The only implementation of the `virtual` function (besides the slight different in x86) was Hexagon (which did `MF.getFrameInfo()->getMaxAlignment() > 8`), and potentially some out-of-tree targets. Hexagon now uses the default implementation.
`needsStackRealignment` was being overwritten in `<Target>GenRegisterInfo.inc`, to return `false` as the default also did. That was odd and is now gone.
Reviewers: sunfish
Subscribers: aemerson, llvm-commits, jfb
Differential Revision: http://reviews.llvm.org/D11160
llvm-svn: 242727
The standard containers are not designed to be inherited from, as
illustrated by the MSVC hacks for NodeOrdering. No functional change
intended.
llvm-svn: 242616
Summary:
This change is part of a series of commits dedicated to have a single
DataLayout during compilation by using always the one owned by the
module.
This patch is quite boring overall, except for some uglyness in
ASMPrinter which has a getDataLayout function but has some clients
that use it without a Module (llmv-dsymutil, llvm-dwarfdump), so
some methods are taking a DataLayout as parameter.
Reviewers: echristo
Subscribers: yaron.keren, rafael, llvm-commits, jholewinski
Differential Revision: http://reviews.llvm.org/D11090
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 242386