Add a common parent `ConstantData` to the constants that have no
operands. These are guaranteed to represent abstract data that is in no
way tied to a specific Module.
This is a good cleanup on its own. It also makes it simpler to disallow
RAUW (and factor away use-lists) on these constants in the future. (I
have some experimental patches that make RAUW illegal on ConstantData,
and they seem to catch a bunch of bugs...)
llvm-svn: 261464
COFF doesn't have sections with mergeable contents. Instead, each
constant pool entry ends up in a COMDAT section. The linker, when
choosing between COMDAT sections, doesn't choose the max alignment of
the two sections. You just get whatever alignment was on the section.
If one constant needed a higher alignment in one object file from
another one, then we will get into trouble if the linker chooses the
lower alignment one.
Instead, lets promote the alignment of the constant pool entry to make
sure we don't use an under aligned constant with an instruction which
assumed otherwise.
This fixes PR26680.
llvm-svn: 261462
The stack pointer is bumped when there is a frame pointer or when there
are static-size objects, but was only getting written back when there
were static-size objects.
llvm-svn: 261453
TailDuplicate can run on either on SSA code or non-SSA code, as indicated to
it by MRI->isSSA() ("PreRegAlloc" here). TailDuplicate does extra work to
preserve SSA invariants when it duplicates code. This patch makes it skip
some of this extra work in the case where the code is not in SSA form.
llvm-svn: 261450
The patch has a necessary call to a function inside an assert. Which is fine
when you have asserts turned on. Not so much when they're off. Sorry about
the regression.
llvm-svn: 261447
This patch corresponds to review:
http://reviews.llvm.org/D17294
It ensures that whatever block we are emitting the prologue/epilogue into, we
have the necessary scratch registers. It takes away the hard-coded register
numbers for use as scratch registers as registers that are guaranteed to be
available in the function prologue/epilogue are not guaranteed to be available
within the function body. Since we shrink-wrap, the prologue/epilogue may end
up in the function body.
llvm-svn: 261441
As discussed on PR24580, this patch adds some (more to come) initial fast-isel codegen tests to match the IR generated in clang/test/CodeGen/sse41-builtins.c
llvm-svn: 261438
Fixed a bug introduced by D16683 when a binary shuffle is simplified to a unary shuffle (with undef/zero sentinel mask indices) - if this resulted in only the second input being used combineX86ShuffleChain failed to take this into account and still referenced the first input.
llvm-svn: 261434
First small step towards fixing PR26667 - we need to ensure that combineX86ShuffleChain only gets called with a valid shuffle input node (a similar issue was found in D17041).
llvm-svn: 261433
the algorithm easily degrades into quadratic memory and time complexity.
The easiest example is a long chain of BBs that don't otherwise use a
location. The caching will add an entry for every intermediate block and
limiting the number of results doesn't help as no results are produced
until a definition is found.
Introduce a limit similar to the existing instructions-per-block limit.
This limit counts the total number of blocks checked. If the limit is
reached, entries are considered unknown. The initial value is 1000,
which avoids regressions for normal sized functions while still
limiting edge cases to reasnable memory consumption and execution time.
Differential Revision: http://reviews.llvm.org/D16123
llvm-svn: 261430
No functional change intended. Copying small (<= 64 bits) APInts isn't
expensive but bloats code by generating the slow path everywhere. Moving
doesn't care about the size of the value.
llvm-svn: 261426
This avoids unnecessarily passing them around when calling helper
functions. It may also be slightly faster to call clear() on the
datastructures instead of freshly initializing them for each block.
llvm-svn: 261407
it to actually test the new pass manager AA wiring.
This patch was extracted from the (somewhat too large) D12357 and
rebosed on top of the slightly different design of the new pass manager
AA wiring that I just landed. With this we can start testing the AA in
a thorough way with the new pass manager.
Some minor cleanups to the code in the pass was necessitated here, but
otherwise it is a very minimal change.
Differential Revision: http://reviews.llvm.org/D17372
llvm-svn: 261403
Cleanuppads may be merged together if one is the only predecessor of the
other in which case a simple transform can be performed: replace the
a cleanupret with a branch and remove an unnecessary cleanuppad.
Differential Revision: http://reviews.llvm.org/D17459
llvm-svn: 261390
TLSADDR nodes are lowered into actuall calls inside MC. In order to prevent
shrink-wrapping from pushing prologue/epilogue past them (which result
in TLS variables being accessed before the stack frame is set up), we
put markers, so that the stack gets adjusted properly.
Thanks to Quentin Colombet for guidance/help on how to fix this problem!
llvm-svn: 261387
Summary:
Instead of trying to replace SMRD instructions with a VGPR base pointer
with an equivalent MUBUF instruction, we now copy the base pointer to
SGPRs using v_readfirstlane.
This is safe to do, because any load selected as an SMRD instruction
has been proven to have a uniform base pointer, so each thread in the
wave will have the same pointer value in VGPRs.
This will fix some errors on VI from trying to replace SMRD instructions
with addr64-enabled MUBUF instructions that don't exist.
Reviewers: arsenm, cfang, nhaehnle
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D17305
llvm-svn: 261385