If vector types have legal register classes, then LLVM bypasses LegalizeTypes
on them, which causes faults currently since the code to handle them isn't in
place.
This fixes test failures when AArch64 is the default target.
llvm-svn: 175172
The parser will now accept instructions with alignment specifiers written like
vld1.8 {d16}, [r0:64]
, while also still accepting the incorrect syntax
vld1.8 {d16}, [r0, :64]
llvm-svn: 175164
up so that we can apply the direct object emitter patch. This patch
should be a nop right now and it's test is to not break what is already
there.
llvm-svn: 175126
of the copy is a subregister def. The current code assumes that it can do a full
def of the destination register, but it is not checking that the def operand is
read-undef. It also doesn't clear the subregister index of the destination in
the new instruction to reflect the full subregister def.
These issues were found running 'make check' with my next commit that enables
rematerialization in more cases.
llvm-svn: 175122
Since functions with internal linkage don't have language linkage, it is valid
to overload them:
extern "C" {
static int foo();
static int foo(int);
}
So we mangle them.
llvm-svn: 175120
It's possible (e.g. after an LTO build) that an internal global may be used for
debugging purposes. If that's the case appending a '.b' to it makes it hard to
find that variable. Steal the name from the old GV before deleting it so that
they can find that variable again.
llvm-svn: 175104
if the offset fits in 11 bits. This makes use of the fact that the abi
requires sp to be 8 byte aligned so the actual offset can fit in 8
bits. It will be shifted left and sign extended before being actually used.
The assembler or direct object emitter will shift right the 11 bit
signed field by 3 bits. We don't need to deal with that here.
llvm-svn: 175073
This happens when there is both stack realignment and a dynamic alloca in the
function. If we overwrite %esi (rep;movsl uses fixed registers) we'll lose the
base pointer and the next register spill will write into oblivion.
Fixes PR15249 and unbreaks firefox on i386/freebsd. Mozilla uses dynamic allocas
and freebsd a 4 byte stack alignment.
llvm-svn: 175057
RegisterCoalescer used to depend on LiveDebugVariable. LDV removes DBG_VALUEs
without emitting them at the end.
We fix this by removing LDV from RegisterCoalescer. Also add an assertion to
make sure we call emitDebugValues if DBG_VALUEs are removed at
runOnMachineFunction.
rdar://problem/13183203
Reviewed by Andy & Jakob
llvm-svn: 175023
This is complicated by backward labels (e.g., 0b can be both a backward label
and a binary zero). The current implementation assumes [0-9]b is always a
label and thus it's possible for 0b and 1b to not be interpreted correctly for
ms-style inline assembly. However, this is relatively simple to fix in the
inline assembly (i.e., drop the [bB]).
This patch also limits backward labels to [0-9]b, so that only 0b and 1b are
ambiguous.
Part of rdar://12470373
llvm-svn: 174983
DAGCombiner::ReduceLoadWidth was converting (trunc i32 (shl i64 v, 32))
into (shl i32 v, 32) into undef. To prevent this, check the shift count
against the final result size.
Patch by: Kevin Schoedel
Reviewed by: Nadav Rotem
llvm-svn: 174972
Vectors were being manually scalarized by the backend. Instead,
let the target-independent code do all of the work. The manual
scalarization was from a time before good target-independent support
for scalarization in LLVM. However, this forces us to specially-handle
vector loads and stores, which we can turn into PTX instructions that
produce/consume multiple operands.
llvm-svn: 174968
'R600/SI: Use proper instructions for array/shadow samplers.' removed two
cases from TEX_SHADOW. Vincent Lejeune reported on IRC that this broke some
shadow array piglit tests with the r600g driver. Reinstating the removed
cases should fix this, and still works with radeonsi as well.
I will follow up with some lit tests which would have caught the regression.
NOTE: This is a candidate for the Mesa stable branch.
Tested-by: Vincent Lejeune <vljn@ovi.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 174963
The bitcode writer emits a reference to the attribute group that the object at
the given index refers to. The bitcode reader is modified to read this in and
map it back to the attribute group.
llvm-svn: 174952
live ranges should always be extended, and the only successor that should be
considered for extension of other ranges is the target of the split edge.
llvm-svn: 174935
Sorry for the lack of a test case. I tried writing one for i386 as i know selects are illegal on this target, but they are actually considered legal by isel and expanded later.
I can't see any targets to trigger this, but checking for the legality of a node before forming it is general goodness.
llvm-svn: 174934
Check for reverse shuffles in the CostModel analysis pass and query
TargetTransform info accordingly. This allows us we can write test cases for
reverse shuffles.
radar://13171406
llvm-svn: 174932
Lower reverse shuffles to a vrev64 and a vext instruction instead of the default
legalization of storing and loading to the stack. This is important because we
generate reverse shuffles in the loop vectorizer when we reverse store to an
array.
uint8_t Arr[N];
for (i = 0; i < N; ++i)
Arr[N - i - 1] = ...
radar://13171760
llvm-svn: 174929
When building the pairable-instruction dependency map, don't search
past the last pairable instruction. For large blocks that have been
divided into multiple instruction groups, searching past the last
instruction in each group is very wasteful. This gives a 32% speedup
on the csa.ll test case from PR15222 (when using 50 instructions
in each group).
No functionality change intended.
llvm-svn: 174915
This map is queried only for instructions in pairs of pairable
instructions; so make sure that only pairs of pairable
instructions are added to the map. This gives a 3.5% speedup
on the csa.ll test case from PR15222.
No functionality change intended.
llvm-svn: 174914
MipsCodeEmitter.cpp.
JALR and NOP are expanded by function emitPseudoExpansionLowering, which is not
called when the old JIT is used.
This fixes the following tests which have been failing on
llvm-mips-linux builder:
LLVM :: ExecutionEngine__2003-01-04-LoopTest.ll
LLVM :: ExecutionEngine__2003-05-06-LivenessClobber.ll
LLVM :: ExecutionEngine__2003-06-04-bzip2-bug.ll
LLVM :: ExecutionEngine__2005-12-02-TailCallBug.ll
LLVM :: ExecutionEngine__2003-10-18-PHINode-ConstantExpr-CondCode-Failure.ll
LLVM :: ExecutionEngine__hello2.ll
LLVM :: ExecutionEngine__stubs.ll
LLVM :: ExecutionEngine__test-branch.ll
LLVM :: ExecutionEngine__test-call.ll
LLVM :: ExecutionEngine__test-common-symbols.ll
LLVM :: ExecutionEngine__test-loadstore.ll
LLVM :: ExecutionEngine__test-loop.ll
llvm-svn: 174912
This eliminates one more linear search over a range of
std::multimap entries. This gives a 22% speedup on the
csa.ll test case from PR15222.
No functionality change intended.
llvm-svn: 174893
The modifiers don't seem to have any effect with V_MOV_B32, supposedly it's
meant to just move bits untouched.
Fixes 46 piglit tests with radeonsi, though unfortunately 11 of those had
just regressed because they started using the clamp modifier.
NOTE: This is a candidate for the Mesa stable branch.
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 174890
This flag makes asan use a small (<2G) offset for 64-bit asan shadow mapping.
On x86_64 this saves us a register, thus achieving ~2/3 of the
zero-base-offset's benefits in both performance and code size.
Thanks Jakub Jelinek for the idea.
llvm-svn: 174886
This does two things:
It removes a call to abs() which may have "long long" parameter on Windows,
which is not necessarily available in C++03.
It also corrects the signedness of Amount, which was relying on
implementation-defined conversions previously.
Code was already tested (albeit in an implemnetation defined way) so no extra
tests.
llvm-svn: 174885
Previous code had a confusing comment which was mostly an implementation
detail. This condition corresponds to "lsb up to register width" and "width not
ridiculous".
llvm-svn: 174877
This gives a DiagnosticType to all AsmOperands in sight. This replaces all
"invalid operand" diagnostics with something more specific. The messages given
should still be sufficiently vague that they're not usually actively misleading
when LLVM guesses your instruction incorrectly.
llvm-svn: 174871
This is currently a bit hairier than it needs to be, since depending on where the
split block resides the end ListEntry of the split block may be the end ListEntry
of the original block or a new entry. Some changes to the SlotIndexes updating
should make it possible to eliminate the two cases here.
This also isn't as optimized as it could be. In the future Liveinterval should
probably get a flag that indicates whether the LiveInterval is within a single
basic block. We could ignore all such intervals when splitting an edge.
llvm-svn: 174870
This emits the attribute groups that are used by the functions. (It currently
doesn't print out return type or parameter attributes within attribute groups.)
Note: The functions still retrieve their attributes from the "old" bitcode
format (using the deprecated 'Raw()' method). This means that string attributes
within an attribute group will not show up during a disassembly. This will be
addressed in a future commit.
llvm-svn: 174867
This reverts my commit 171047. Now that I've removed my misguided attempt to
support backend warnings, these diagnostics are only about inline assembly.
It would take quite a bit more work to generalize them properly, so I'm
just reverting this.
llvm-svn: 174860
This removes the last of the linear searches over ranges of std::multimap
iterators, giving a 7% speedup on the doduc.bc input from PR15222.
No functionality change intended.
llvm-svn: 174859
Profiling suggests that getInstructionTypes is performance-sensitive,
this cleans up some double-casting in that function in favor of
using dyn_cast.
No functionality change intended.
llvm-svn: 174857
By itself, this does not have much of an effect, but only because in the default
configuration the full cycle checks are used only for small problem sizes.
This is part of a general cleanup of uses of iteration over std::multimap
ranges only for the purpose of checking membership.
No functionality change intended.
llvm-svn: 174856
function is successfully handled by fast-isel. That's because function
arguments are *always* handled by SDISel. Introduce FastLowerArguments to
allow each target to provide hook to handle formal argument lowering.
As a proof-of-concept, add ARMFastIsel::FastLowerArguments to handle
functions with 4 or fewer scalar integer (i8, i16, or i32) arguments. It
completely eliminates the need for SDISel for trivial functions.
rdar://13163905
llvm-svn: 174855
I have some uncommitted changes to the cast code that catch this sort of thing
at compile-time but I still need to do some other cleanup before I can enable
it.
llvm-svn: 174853
support for updating SlotIndexes to MachineBasicBlock::SplitCriticalEdge(). This
calls renumberIndexes() every time; it should be improved to only renumber
locally.
llvm-svn: 174851
This reads the attribute groups. It currently doesn't do anything with them.
NOTE: In the commit to the bitcode writer, the format *may* change in the near
future. Which means that this code would also change.
llvm-svn: 174849
This is some initial code for emitting the attribute groups into the bitcode.
NOTE: This format *may* change! Do not rely upon the attribute groups' bitcode
not changing.
llvm-svn: 174845
present, it currently verifies them with the MachineVerifier, and this passed
all of the test cases in 'make check' (when accounting for existing verifier
errors). There were some assertion failures in the two-address pass, but they
also happened on code without phis and look like they are caused by different
kill flags from LiveIntervals.
The only part that doesn't work is the critical edge splitting heuristic,
because there isn't currently an efficient way to update LiveIntervals after
splitting an edge. I'll probably start by implementing the slow fallback and
test that it works before tackling the fast path for single-block ranges. The
existing code that updates LiveVariables is fairly slow as it is.
There isn't a command-line option for enabling this; instead, just edit
PHIElimination.cpp to require LiveIntervals.
llvm-svn: 174831
The original syntax for the attribute groups was ambiguous. For example:
declare void @foo() #1#0 = attributes { noinline }
The '#0' would be parsed as an attribute reference for '@foo' and not as a
top-level entity. In order to continue forward while waiting for a decision on
what the correct syntax is, I'm changing it to this instead:
declare void @foo() #1
attributes #0 = { noinline }
Repeat: This is TEMPORARY until we decide what the correct syntax should be.
llvm-svn: 174813
bitcode writer would generate abbrev records saying that the abbrev should be
filled with fixed zero-bit bitfields (this happens in the .bc writer when
the number of types used in a module is exactly one, since log2(1) == 0).
In this case, just handle it as a literal zero. We can't "just fix" the writer
without breaking compatibility with existing bc files, so have the abbrev reader
do the substitution.
Strengthen the assert in read to reject reads of zero bits so we catch such
crimes in the future, and remove the special case designed to handle this.
llvm-svn: 174801
This uses a liveness algorithm that does not depend on data from the
LiveVariables analysis, it is the first step towards removing
LiveVariables completely.
llvm-svn: 174774
This fixes a couple of bugs and incorrect assumptions,
in total four more piglit tests now pass.
v2: fix small bug in the dominator updating
Patch by: Christian König
Signed-off-by: Christian König <christian.koenig@amd.com>
llvm-svn: 174762
Patch by: Christian König
Intersecting loop handling was wrong.
Signed-off-by: Christian König <christian.koenig@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 174761
Otherwise we sometimes produce invalid code.
Patch by: Christian König
Signed-off-by: Christian König <christian.koenig@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 174760
This reverts r171041. This was a nice idea that didn't work out well.
Clang warnings need to be associated with warning groups so that they can
be selectively disabled, promoted to errors, etc. This simplistic patch didn't
allow for that. Enhancing it to provide some way for the backend to specify
a front-end warning type seems like overkill for the few uses of this, at
least for now.
llvm-svn: 174748
same so we put in the comment field an indicator when we think we are
emitting the 16 bit version. For the direct object emitter, the difference is
important as well as for other passes which need an accurate count of
program size. There will be other similar putbacks to this for various
instructions.
llvm-svn: 174747
Previously, even when a pre-increment load or store was generated,
we often needed to keep a copy of the original base register for use
with other offsets. If all of these offsets are constants (including
the offset which was combined into the addressing mode), then this is
clearly unnecessary. This change adjusts these other offsets to use the
new incremented address.
llvm-svn: 174746
This is a follow-up to the cost-model change in r174713 which splits
the cost of a memory operation between the address computation and the
actual memory access. In r174713, this cost is always added to the
memory operation cost, and so BBVectorize will do the same.
Currently, this new cost function is used only by ARM, and I don't
have any ARM test cases for BBVectorize. Assistance in generating some
good ARM test cases for BBVectorize would be greatly appreciated!
llvm-svn: 174743
Aside from the question of whether we report a warning or an error when we
can't satisfy a requested stack object alignment, the current implementation
of this is not good. We're not providing any source location in the diagnostics
and the current warning is not connected to any warning group so you can't
control it. We could improve the source location somewhat, but we can do a
much better job if this check is implemented in the front-end, so let's do that
instead. <rdar://problem/13127907>
llvm-svn: 174741
Thanks to help from Nadav and Hal, I have a more reasonable (and even
correct!) approach. This specifically penalizes the insertelement
and extractelement operations for the performance hit that will occur
on PowerPC processors.
llvm-svn: 174725
isn't using the default calling convention. However, if the transformation is
from a call to inline IR, then the calling convention doesn't matter.
rdar://13157990
llvm-svn: 174724
of lines which weren't being explicitly looked at and were printing incorrect results. These
values clearly must lie within 32 bits, so the casts are definitely safe.
llvm-svn: 174717
Adds a function to target transform info to query for the cost of address
computation. The cost model analysis pass now also queries this interface.
The code in LoopVectorize adds the cost of address computation as part of the
memory instruction cost calculation. Only there, we know whether the instruction
will be scalarized or not.
Increase the penality for inserting in to D registers on swift. This becomes
necessary because we now always assume that address computation has a cost and
three is a closer value to the architecture.
radar://13097204
llvm-svn: 174713
Attribute references are of this form:
define void @foo() #0#1#2 { ... }
Parse them for function attributes. If there's more than one reference, then
they are merged together.
llvm-svn: 174697
allowed size for the instruction. This code uses RegScavenger to fix this.
We sometimes need 2 registers for Mips16 so we must handle things
differently than how register scavenger is normally used.
llvm-svn: 174696
Add #include <unistd.h> to OProfileWrapper.cpp. This provides the declarations for 'read' and 'close' that are otherwise missing, and result in 'error: <foo> was not declared in this scope'.
This matches the issue as reported in bug 15055 "Can no longer compile LLVM with --with-oprofile"
llvm-svn: 174661
Certain vector operations don't vectorize well with the current
PowerPC implementation. Element insert/extract performs poorly
without VSX support because Altivec requires going through memory.
SREM, UREM, and VSELECT all produce bad scalar code.
There's a lot of work to do for the cost model before
autovectorization will be tuned well, and this is not an attempt to
address the larger problem.
llvm-svn: 174660
Remove all the unused code.
Patch by: Christian König
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 174656
Allows nexuiz to run with radeonsi.
Patch by: Michel Dänzer
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 174655
20 more little piglits with radeonsi.
Patch by: Michel Dänzer
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 174654
The _SGPR variants where wrong.
Patch by: Christian König
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 174653
v2: rebased on current upstream
Patch by: Christian König
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 174652
This is for the case when no processor is passed to the backend. This
prevents the
'' is not a recognized processor for this target (ignoring processor)
warning from being generated by clang.
llvm-svn: 174651
We don't want too many classes in a pass and the classes obscure the details. I
was going a little overboard with object modeling here. Replace classes by
generic code that handles both loads and stores.
No functionality change intended.
llvm-svn: 174646
Handle vectors of 1 to 16 integers.
Change the intrinsic names to prevent the wrong one from being selected at
runtime due to the overloading.
Patch By: Michel Dänzer
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 174633
v1i32, v2i32, v8i32 and v16i32.
Only add VGPR register classes for integer vector types, to avoid attempts
copying from VGPR to SGPR registers, which is not possible.
Patch By: Michel Dänzer
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 174632
Use sub0-15 everywhere.
Patch by: Michel Dänzerr
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 174610
These instructions compare two floating point values and return an
integer true (-1) or false (0) value.
When compiling code generated by the Mesa GLSL frontend, the SET*_DX10
instructions save us four instructions for most branch decisions that
use floating-point comparisons.
llvm-svn: 174609
A double inclusion will pretty much always be an error in TableGen, so
there's no point going on just to die with "def already defined" or
whatnot.
I'm not too thrilled about the "public: ... private: ..." to expose the
DependenciesMapTy, but I really didn't see a better way to keep that
type centralized. It's a smell that indicates that some refactoring is
needed to make this code more loosely coupled.
This should avoid all bugs of the same nature as PR15189.
llvm-svn: 174582
1. Moved a comment from ObjCARCOpts.cpp -> ObjCARCContract.cpp.
2. Removed a comment from ObjCARCOpts.cpp that was already moved to
ObjCARCAliasAnalysis.h/.cpp.
llvm-svn: 174581
account. Atoms use LEA for updating SP in prologs/epilogs, and the
exact LEA opcode depends on the data model.
Also reapplying the test case which was added and then reverted
(because of Atom failures), this time specifying explicitly the CPU in
addition to the triple. The test case now checks all variations (data
mode, cpu Atom vs. Core).
llvm-svn: 174542
Most of PPCCallingConv.td is used only by the 32-bit SVR4 ABI. Rename
things to clarify this. Also delete some code that's been commented out
for a long time.
llvm-svn: 174526
Only implemented for R600 so far. SI is missing implementations of a
few callbacks used by the Indirect Addressing pass and needs code to
handle frame indices.
At the moment R600 only supports array sizes of 16 dwords or less.
Register packing of vector types is currently disabled, which means that a
vec4 is stored in T0_X, T1_X, T2_X, T3_X, rather than T0_XYZW. In order
to correctly pack registers in all cases, we will need to implement an
analysis pass for R600 that determines the correct vector width for each
array.
v2:
- Add support for i8 zext load from stack.
- Coding style fixes
v3:
- Don't reserve registers for indirect addressing when it isn't
being used.
- Fix bug caused by LLVM limiting the number of SubRegIndex
declarations.
v4:
- Fix 64-bit defines
llvm-svn: 174525
Weakly defined symbols should evaluate to 0 if they're undefined at
link-time. This is impossible to do with the usual address generation
patterns, so we should use a literal pool entry to materlialise the
address.
llvm-svn: 174518
These instructions are a late addition to the architecture, and may
yet end up behind an optional attribute, but for now they're available
at all times.
llvm-svn: 174496
Attribute groups are of the form:
#0 = attributes { noinline "no-sse" "cpu"="cortex-a8" alignstack=4 }
Target-dependent attributes are represented as strings. Attributes can have
optional values associated with them. E.g., the "cpu" attribute has the value
"cortex-a8".
Target-independent attributes are listed as enums inside the attribute classes.
Multiple attribute groups can be referenced by the same object. In that case,
the attributes are merged together.
llvm-svn: 174493
Use the validateTargetOperandClass() hook to match literal '#0' operands in
InstAlias definitions. Previously this required per-instruction C++ munging of the
operand list, but not is handled as a natural part of the matcher. Much better.
No additional tests are required, as the pre-existing tests for these instructions
exercise the new behaviour as being functionally equivalent to the old.
llvm-svn: 174488
This is useful when parsing an object that references multiple attribute groups.
N.B. If both builders have alignments specified, then they should match!
llvm-svn: 174480
Failure: undefined symbol 'Lline_table_start0'.
Root-cause: we use a symbol subtraction to calculate at_stmt_list, but
the line table entries are not dumped in the assembly.
Fix: use zero instead of a symbol subtraction for Compile Unit 0.
llvm-svn: 174479
The stuff we're handing are all enums (Attribute::AttrKind), integers and
strings. Don't convert them to Constants, which is an unnecessary step here. The
rest of the changes are mostly mechanical.
llvm-svn: 174456
pointer in function prologs/epilogs. The opcodes should depend on the
data model (LP64 vs. ILP32) rather than the architecture bit-ness.
llvm-svn: 174446
is a vararg function.
The original code was examining flag OutputArg::IsFixed to determine whether
CC_MipsN_VarArg or CC_MipsN should be called. This is not correct, since this
flag is often set to false when the function being analyzed is a non-variadic
function.
llvm-svn: 174442
base point of a load, and the overall alignment of the load. This caused infinite loops in DAG combine with the
original application of this patch.
ORIGINAL COMMIT LOG:
When the target-independent DAGCombiner inferred a higher alignment for a load,
it would replace the load with one with the higher alignment. However, it did
not place the new load in the worklist, which prevented later DAG combines in
the same phase (for example, target-specific combines) from ever seeing it.
This patch corrects that oversight, and updates some tests whose output changed
due to slightly different DAGCombine outputs.
llvm-svn: 174431
All targets are now adding return value registers as implicit uses on
return instructions, and there is no longer a need for the live out
lists.
llvm-svn: 174417
Currently, when a fragment is relaxed, its size is modified, but its
offset is not (it gets laid out as a side effect of checking whether
it needs relaxation), then all subsequent fragments are invalidated
because their offsets need to change. When bundling is enabled,
relaxed fragments need to get laid out again, because the increase in
size may push it over a bundle boundary. So instead of only
invalidating subsequent fragments, also invalidate the fragment that
gets relaxed, which causes it to get laid out again.
This patch also fixes some trailing whitespace and fixes the
bundling-related debug output of MCFragments.
llvm-svn: 174401
Something very strange is going on with the output registers in this
target. Its ISelLowering code is inserting dangling CopyToReg nodes,
hoping that those physregs won't get clobbered before the RETURN.
This patch adds the output registers as implicit uses on RETURN
instructions in the custom emission pass. I'd much prefer to have those
CopyToReg nodes glued to the RETURNs, but I don't see how.
llvm-svn: 174400
The liveout lists are about to be removed from MRI, this is the only
place they were used after register allocation.
Get the live out V registers directly from the return instructions
instead.
llvm-svn: 174399
Use one intrinsic for all sorts of interpolation.
Use two separate unexpanded instructions to represent INTERP_XY and _ZW -
this will allow to eliminate one part if it's not used.
Track liveness of special interpolation regs instead of reserving them -
this will allow to reuse those regs, lowering reg pressure.
Patch By: Vadim Girlin
v2[Vincent Lejeune]: Rebased against current llvm master
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 174394
Emitting the function name allows us to check for it in the FileCheck
tests so we can make sure FileCheck is checking the output of the
correct function.
llvm-svn: 174392
Fixes 37 piglit tests and allows e.g. FlightGear to run with radeonsi.
Patch by: Michel Dänzer
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 174391
In the loop vectorizer cost model, we used to ignore stores/loads of a pointer
type when computing the widest type within a loop. This meant that if we had
only stores/loads of pointers in a loop we would return a widest type of 8bits
(instead of 32 or 64 bit) and therefore a vector factor that was too big.
Now, if we see a consecutive store/load of pointers we use the size of a pointer
(from data layout).
This problem occured in SingleSource/Benchmarks/Shootout-C++/hash.cpp (reduced
test case is the first test in vector_ptr_load_store.ll).
radar://13139343
llvm-svn: 174377
This moves the bit twiddling and string fiddling functions required by other
parts of the backend into a separate library. Previously they resided in
AArch64Desc, which created a circular dependency between various components.
llvm-svn: 174369
and enables the instruction printer to print aliased
instructions.
Due to usage of RegisterOperands a change in common
code (utils/TableGen/AsmWriterEmitter.cpp) is required
to get the correct register value if it is a RegisterOperand.
Contributer: Vladimir Medic
llvm-svn: 174358
it would replace the load with one with the higher alignment. However, it did
not place the new load in the worklist, which prevented later DAG combines in
the same phase (for example, target-specific combines) from ever seeing it.
This patch corrects that oversight, and updates some tests whose output changed
due to slightly different DAGCombine outputs.
llvm-svn: 174343
Per discussion in rdar://13127907, we should emit a hard error only if
people write code where the requested alignment is larger than achievable
and assumes the low bits are zeros. A warning should be good enough when
we are not sure if the source code assumes the low bits are zeros.
rdar://13127907
llvm-svn: 174336
Rename the PARAMATTR_CODE_ENTRY to PARAMATTR_CODE_ENTRY_OLD. It will be replaced
by another encoding. Keep around the current LLVM attribute encoder/decoder
code, but move it to the bitcode directories so that no one's tempted to use
them.
llvm-svn: 174335
I didn't see those because the test case used "not grep". FileCheck the test and
XFAIL it, preserving the old optimization, so this can be fixed eventually.
llvm-svn: 174330
This required disabling a PowerPC optimization that did the following:
input:
x = BUILD_VECTOR <i32 16, i32 16, i32 16, i32 16>
lowered to:
tmp = BUILD_VECTOR <i32 8, i32 8, i32 8, i32 8>
x = ADD tmp, tmp
The add now gets folded immediately and we're back at the BUILD_VECTOR we
started from. I don't see a way to fix this currently so I left it disabled
for now.
Fix some trivially foldable X86 tests too.
llvm-svn: 174325
This change lets us bootstrap LLVM/Clang under ASan and MSan. It contains
fixes for 2 issues:
- X86JIT reads return address from stack, which MSan does not know is
initialized.
- bugpoint tests run binaries with RLIMIT_AS. This does not work with certain
Sanitizers.
We are no longer including config.h in Compiler.h with this change.
llvm-svn: 174306
Swift has a renaming dependency if we load into D subregisters. We don't have a
way of distinguishing between insertelement operations of values from loads and
other values. Therefore, we are pessimistic for now (The performance problem
showed up in example 14 of gcc-loops).
radar://13096933
llvm-svn: 174300
I am going to add in the actual test cases with the actual functionality
changes in a later patch because I want to include some test cases.
To be clear when I say no *TRUE* functionality change I mean that this
patch (like it says in the title) only contains getters/setters and sets
up a default initial value of the instance variable to false so that
this patch does not affect any other uses of Global Variable.h.
llvm-svn: 174295
Improve performance of iterating over children and accessing the member file
buffer by caching the file size and moving code out to the header.
This also makes getBuffer return a StringRef instead of a MemoryBuffer. Both
fixing a memory leak and removing a malloc.
This takes getBuffer from ~10% of the time in lld to unmeasurable.
llvm-svn: 174272
The main lists of debug info metadata attached to the compile_unit had an extra
layer of metadata nodes they went through for no apparent reason. This patch
removes that (& still passes just as much of the GDB 7.5 test suite). If anyone
can show evidence as to why these extra metadata nodes are there I'm open to
reverting this patch & documenting why they're there.
llvm-svn: 174266
Use the AttributeSet's iterators in AttrBuilder::hasAttributes() when
determining of the intersection of the AttrBuilder and AttributeSet is non-null.
llvm-svn: 174250
says, but that's a defect (to be filed). "Cls::purevfn()" is still an odr use.
Also fixes a bug in the previous patch that caused us to not mark the function
referenced just because we didn't want to mark it odr used.
llvm-svn: 174240
the SCEV vector size in LoopStrengthReduce. It is observed that
the BaseRegs vector size is 4 in most cases,
and elements are frequently copied when it is initialized as
SmallVector<const SCEV *, 2> BaseRegs.
Our benchmark results show that the compilation time performance
improved by ~0.5%.
Patch by Wan Xiaofei.
llvm-svn: 174219
1) allows the use of RIP-relative addressing in 32-bit LEA instructions under
x86-64 (ILP32 and LP64)
2) separates the size of address registers in 64-bit LEA instructions from
control by ILP32/LP64.
llvm-svn: 174208