The global entry point prologue currently assumes that the TOC
associated with a function is less than 2GB away from the function
entry point. This is always true when using the medium or small
code model, but may not be the case when using the large code model.
This patch adds a new variant of the ELFv2 global entry point prologue
that lifts the 2GB restriction when building with -mcmodel=large.
This works by emitting a quadword containing the distance from the
function entry point to its associated TOC immediately before the
entry point, and then using a prologue like:
ld r2,-8(r12)
add r2,r2,r12
Since creation of the entry point prologue is now split across two
separate routines (PPCLinuxAsmPrinter::EmitFunctionEntryLabel emits
the data word, PPCLinuxAsmPrinter::EmitFunctionBodyStart the prolog
code), I've switched to using named labels instead of just temporaries
to indicate the locations of the global and local entry points and the
new TOC offset data word.
These names are provided by new routines in PPCFunctionInfo modeled
after the existing PPCFunctionInfo::getPICOffsetSymbol.
Note that a corresponding change was committed to GCC here:
https://gcc.gnu.org/ml/gcc-patches/2015-12/msg00355.html
Reviewers: hfinkel
Differential Revision: http://reviews.llvm.org/D15500
llvm-svn: 257597
Summary:
With the ability to concatenate shader binaries, the limit of 15 no longer
applies.
Reviewers: tstellarAMD, arsenm
Subscribers: arsenm
Differential Revision: http://reviews.llvm.org/D16031
llvm-svn: 257592
Summary:
This allows Mesa to pass initial SPI_PS_INPUT_ADDR to LLVM.
The register assigns VGPR locations to PS inputs, while the ENA register
determines whether or not they are loaded.
Mesa needs to set some inputs as not-movable, so that a pixel shader prolog
binary appended at the beginning can assume where some inputs are.
v2: Make PSInputAddr private, because there is never enough silly getters
and setters for people to read.
Reviewers: tstellarAMD, arsenm
Subscribers: arsenm
Differential Revision: http://reviews.llvm.org/D16030
llvm-svn: 257591
Summary: ret.ll will contain a test for this
Reviewers: tstellarAMD, arsenm
Subscribers: arsenm
Differential Revision: http://reviews.llvm.org/D16029
llvm-svn: 257590
Make x86 OptimizeLEAs pass remove LEA instruction if there is another LEA
(in the same basic block) which calculates address differing only be a
displacement. Works only for -Oz.
Differential Revision: http://reviews.llvm.org/D13295
llvm-svn: 257589
This patch turns off the fast-math optimization attribute on the caller
if the callee's fast-math attribute is not turned on.
For example,
- before inlining
caller: "less-precise-fpmad"="true"
callee: "less-precise-fpmad"="false"
- after inlining
caller: "less-precise-fpmad"="false"
Alternatively, it's possible to block inlining if the caller's and
callee's attributes don't match. If this approach is preferable to the
one in this patch, we can discuss post-commit.
rdar://problem/19836465
Differential Revision: http://reviews.llvm.org/D7802
llvm-svn: 257575
Summary:
The problem here is that an enum class can not be implicitly converted to an
integer. That assumption snuck back into PointerIntPair. This commit fixes the
issue and more importantly adds some unittests to make sure that we do not break
this again.
rdar://23594806
Reviewers: gribozavr
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D16131
llvm-svn: 257574
AnalyzeBranch on X86 (and, previously, SPARC, which implementation was
copied from X86) tries to modify the branches based on block
layout (e.g. checking isLayoutSuccessor), when AllowModify is true.
The rest of the architectures leave that up to the caller, which can
call InsertBranch, RemoveBranch, and ReverseBranchCondition as
appropriate. That appears to be the preferred way to do it nowadays.
This commit makes SPARC like the rest: replaces AnalyzeBranch with an
implementation cribbed from AArch64, and adds a ReverseBranchCondition
implementation.
Additionally, a test-case has been added (also cribbed from AArch64)
demonstrating that redundant branch sequences no longer get emitted.
E.g., it used to emit code like this:
bne .LBB1_2
nop
ba .LBB1_1
nop
.LBB1_2:
And now emits:
cmp %i0, 42
be .LBB1_1
nop
llvm-svn: 257572
(Resubmit after fixing a typo that breaks test on big endian
machines)
In this refactoring, member functions are introduced to access
CovMap header/func record members and hide layout details. This
will enable further code restructuring to support reading multiple
versions of coverage mapping data with shared/templatized code.
(When coveremap format version changes, backward compatibtility
should be preserved).
llvm-svn: 257571
When we arrive at the end of the function, the validation of
the object has been done already. In theory, so, we should never
arrive here with something broken as the object isn't mutated.
Practice sometimes proves theory to be wrong, so leave an assertion
instead, as suggested by David Blaikie, to catch bugs.
llvm-svn: 257570
The version numbers of the darwin kernel are different from the version
numbers of OS X, so we need adjustments if we had "*-*-darwin" triples.
Use the existing utility functions in TargetTriple for this.
Fixes rdar://22056966
Differential Revision: http://reviews.llvm.org/D14601
llvm-svn: 257555
The line tables for CodeView make a distinction between expressions and
statements. As it turns out, MSVC always emits them as statements and
we always emit them as expressions. Let's switch to statements to match
the CodeView that they emit.
llvm-svn: 257553
This change has us print out fields we didn't previously understand. To
improve readability, we now group column information with it's
respective line.
llvm-svn: 257552
(Resubmit after fixing build bot failures)
In this refactoring, member functions are introduced to access
CovMap header/func record members and hide layout details. This
will enable further code restructuring to support reading multiple
versions of coverage mapping data with shared/templatized code.
(When coveremap format version changes, backward compatibtility
should be preserved).
llvm-svn: 257551
The follow extra changes were made to test cases:
Manually making the variable be the actual type instead of a pointer
to avoid pointer-size differences in generic code:
LLVM :: DebugInfo/Generic/2010-03-24-MemberFn.ll
LLVM :: DebugInfo/Generic/2010-04-06-NestedFnDbgInfo.ll
LLVM :: DebugInfo/Generic/2010-05-03-DisableFramePtr.ll
LLVM :: DebugInfo/Generic/varargs.ll
Delete sizing information from debug info for the same reason
(but the presence of the pointer was important to the test case):
LLVM :: DebugInfo/Generic/restrict.ll
LLVM :: DebugInfo/Generic/tu-composite.ll
LLVM :: Linker/type-unique-type-array-a.ll
LLVM :: Linker/type-unique-simple2.ll
Fixing an incorrect DW_OP_deref
LLVM :: DebugInfo/Generic/2010-05-03-OriginDIE.ll
Fixing a missing DW_OP_deref
LLVM :: DebugInfo/Generic/incorrect-variable-debugloc.ll
Additionally, clang should no longer complain during bootstrap should no
longer happen after r257534.
The original commit message was:
```
Summary:
Teach the Verifier to make sure that the storage size given to llvm.dbg.declare
or the value size given to llvm.dbg.value agree with what is declared in
DebugInfo. This is implicitly assumed in a number of passes (e.g. in SROA).
Additionally this catches a number of common mistakes, such as passing a
pointer when a value was intended or vice versa.
One complication comes from stack coloring which modifies the original IR when
it merges allocas in order to make sure that if AA falls back to the IR it gets
the correct result. However, given this new invariant, indiscriminately
replacing one alloca by a different (differently sized one) is no longer valid.
Fix this by just undefing out any use of the alloca in a dbg.declare in this
case.
Additionally, I had to fix a number of test cases. Of particular note:
- I regenerated dbg-changes-codegen-branch-folding.ll from the given source as
it was affected by the bug fixed in r256077
- two-cus-from-same-file.ll was changed to avoid having a variable-typed debug
variable as that would depend on the target, even though this test is
supposed to be generic
- I had to manually declared size/align for reference type. See also the
discussion for D14275/r253186.
- fpstack-debuginstr-kill.ll required changing `double` to `long double`
- most others were just a question of adding OP_deref
```
llvm-svn: 257550
to only print the first private header.
Which for Mach-O files only prints the Mach header and not the subsequent load
commands. Which is used by scripts to match what the darwin otool(1) with the
-h flag does without the -l flag.
For non-Mach-O files it has the same functionality as -private-headers (with
the trailing ’s’).
rdar://24158331
llvm-svn: 257548
In this refactoring, member functions are introduced to access
CovMap header/func record members and hide layout details. This
will enable further code restructuring to support reading multiple
versions of coverage mapping data with shared/templatized code.
(When coveremap format version changes, backward compatibtility
should be preserved).
llvm-svn: 257547
Summary:
BFC instructions are available in ARMv6T2 and above.
Reviewers: t.p.northover
Subscribers: aemerson
Differential Revision: http://reviews.llvm.org/D16076
llvm-svn: 257546
VMOVs are not strictly speaking cheap, but they are as expensive as a vector
copy (VORR), so we should prefer rematerialization over splitting when it
applies.
rdar://problem/23754176
llvm-svn: 257545
Previously the RegisterOperands have only been used internally in
RegisterPressure.cpp. However this datastructure can be useful for other
tasks as well and allows refactoring of PDiff initialisation out of
RPTracker::recede().
This patch:
- Exposes RegisterOperands as public API
- Splits RPTracker::recede() into a part that skips DebugValues and
maintains the region borders, and the core that changes register
pressure when given a set of RegisterOperands.
- This allows to move the PDiff initialisation out recede() into a
method of the PressureDiffs class.
- The upcoming subregister scheduling code will also use
RegisterOperands to avoid pushing more unrelated functionality into
recede()/advance().
Differential Revision: http://reviews.llvm.org/D15473
llvm-svn: 257535
Summary: The dbg.declare -> dbg.value conversion looks through any zext/sext
to find a value to describe the variable (in the expectation that those
zext/sext instruction will go away later). However, those values do not
cover the entire variable and thus need a DW_OP_bit_piece.
Reviewers: aprantl
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D16061
llvm-svn: 257534
Summary: Add SaturatingMultiplyAdd convenience function template since A + (X * Y) comes up frequently when doing weighted arithmetic.
Reviewers: davidxl, silvas
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D15385
llvm-svn: 257532
CodeView, unlike DWARF, can associate code with a range of columns.
However, LLVM can only represent a single column position internally.
We used to claim that the end column and start column were the same
which yielded less than satisfactory results: we would stop printing at
the _beginning_ of the source expression! Instead, mark the column-end
as 'zero' to indicate that we don't have one (as per the documentation
for IDiaLineNumber::get_lineNumberEnd).
llvm-svn: 257528
Only non-weighted predicates were handled in PPCInstrInfo::insertSelect. Handle
the weighted predicates as well.
This latent bug was triggered by r255398, because it added use of the
branch-weighted predicates.
While here, switch over an enum instead of an int to get the compiler to enforce
totality in the future.
llvm-svn: 257518
A request has been made to the official registry, but an official value is
not yet available. This patch uses a temporary value in order to support
development. When an official value is recieved, the value of EM_WEBASSEMBLY
will be updated.
llvm-svn: 257517
Refactor .param, .result, .local, and .endfunc, as directives, using the
proper MCTargetStreamer mechanism, rather than fake instructions.
llvm-svn: 257511
This patch changes the way labels are referenced. Instead of referencing the
basic-block label name (eg. .LBB0_0), instructions now just have an immediate
which indicates the depth in the control-flow stack to find a label to jump to.
This makes them much closer to what we expect to have in the binary encoding,
and avoids the problem of basic-block label names not being explicit in the
binary encoding.
Also, it terminates blocks and loops with end_block and end_loop instructions,
rather than basic-block label names, for similar reasons.
This will also fix problems where two constructs appear to have the same label,
because we no longer explicitly use labels, so consumers that need labels will
presumably create their own labels, and presumably they won't reuse labels
when they do.
This patch does make the code a little more awkward to read; as a partial
mitigation, this patch also introduces comments showing where the labels are,
and comments on each branch showing where it's branching to.
llvm-svn: 257505
The findExternalCalls routine ignores calls to functions already
defined in the dest module. This was not handling the case where
the definition in the current module is actually an alias to a
function call.
llvm-svn: 257493
This is a very limited implementation of DFG-based copy propagation.
It only handles actual COPY instructions (does not handle other equivalents
such as add-immediate with a 0 operand).
The major limitation is that it does not update the DFG: that will be the
change required to make it more robust (hopefully coming up soon).
llvm-svn: 257490
Prepatory patch before changing LibCallSimplifier to use the FMF.
Also, tighten the CHECK lines and give the tests more meaningful names.
Similar changes to:
http://reviews.llvm.org/rL257414
llvm-svn: 257481
Summary: The result register is the second operand as per the other mt* instructions.
Reviewers: vkalintiris
Subscribers: llvm-commits, dsanders
Differential Revision: http://reviews.llvm.org/D15993
llvm-svn: 257478
Target independent, SSA-based data flow framework for representing
data flow between physical registers.
This commit implements the creation of the actual data flow graph.
llvm-svn: 257477
Summary:
This fixes three bugs, in all of which state is not or incorrecly reset between
objects (i.e. when reusing the same pass manager to create multiple object
files):
1) AttributeSection needs to be reset to nullptr, because otherwise the backend
will try to emit into the old object file's attribute section causing a
segmentation fault.
2) MappingSymbolCounter needs to be reset, otherwise the second object file
will start where the first one left off.
3) The MCStreamer base class resets the Streamer's e_flags settings. Since
EF_ARM_EABI_VER5 is set on streamer creation, we need to set it again
after the MCStreamer was rest.
Also rename Reset (uppser case) to EHReset to avoid confusion with
reset (lower case).
Reviewers: rengolin
Differential Revision: http://reviews.llvm.org/D15950
llvm-svn: 257473
(64 to 128-bit) matches against the pattern fragment 'vzmovl_v2i64'
(a zero-extended 64-bit load).
However, a change in r248784 teaches the instruction combiner that only
the lower 64 bits of the input to a 128-bit vcvtph2ps are used. This means
the instruction combiner will ordinarily optimize away the upper 64-bit
insertelement instruction in the zero-extension and so we no longer select
the memory-register form. To fix this a new pattern has been added.
Differential Revision: http://reviews.llvm.org/D16067
llvm-svn: 257470
This means that the DEBUG_TYPE cannot take a comma anymore. All existing passes
conform to this rule.
Differential Revision: http://reviews.llvm.org/D15645
llvm-svn: 257466
With this, one can build a lib from the objects of other libs:
set(SOURCES
$<TARGET_OBJECTS:obj.clingInterpreter>
$<TARGET_OBJECTS:obj.clingMetaProcessor>
$<TARGET_OBJECTS:obj.clingUtils>
)
Reviewed by Chris Bieneman - thanks!
llvm-svn: 257459
handlers.
It is expected that RPC handlers will usually be member functions. Accepting them
directly in handle and expect allows for the remove of a lot of lambdas an
explicit error variables.
This patch also uses this new feature to substantially tidy up the
OrcRemoteTargetServer class.
llvm-svn: 257452
Since function definitions are not loaded into the address space, PT_LOAD is
inappropriate. PT_WEBASSEMBLY_FUNCTIONS is used to identify where the function
definitions are so that they can be processed at program startup time.
llvm-svn: 257436