By itself, this does not have much of an effect, but only because in the default
configuration the full cycle checks are used only for small problem sizes.
This is part of a general cleanup of uses of iteration over std::multimap
ranges only for the purpose of checking membership.
No functionality change intended.
llvm-svn: 174856
function is successfully handled by fast-isel. That's because function
arguments are *always* handled by SDISel. Introduce FastLowerArguments to
allow each target to provide hook to handle formal argument lowering.
As a proof-of-concept, add ARMFastIsel::FastLowerArguments to handle
functions with 4 or fewer scalar integer (i8, i16, or i32) arguments. It
completely eliminates the need for SDISel for trivial functions.
rdar://13163905
llvm-svn: 174855
I have some uncommitted changes to the cast code that catch this sort of thing
at compile-time but I still need to do some other cleanup before I can enable
it.
llvm-svn: 174853
support for updating SlotIndexes to MachineBasicBlock::SplitCriticalEdge(). This
calls renumberIndexes() every time; it should be improved to only renumber
locally.
llvm-svn: 174851
This reads the attribute groups. It currently doesn't do anything with them.
NOTE: In the commit to the bitcode writer, the format *may* change in the near
future. Which means that this code would also change.
llvm-svn: 174849
This is some initial code for emitting the attribute groups into the bitcode.
NOTE: This format *may* change! Do not rely upon the attribute groups' bitcode
not changing.
llvm-svn: 174845
present, it currently verifies them with the MachineVerifier, and this passed
all of the test cases in 'make check' (when accounting for existing verifier
errors). There were some assertion failures in the two-address pass, but they
also happened on code without phis and look like they are caused by different
kill flags from LiveIntervals.
The only part that doesn't work is the critical edge splitting heuristic,
because there isn't currently an efficient way to update LiveIntervals after
splitting an edge. I'll probably start by implementing the slow fallback and
test that it works before tackling the fast path for single-block ranges. The
existing code that updates LiveVariables is fairly slow as it is.
There isn't a command-line option for enabling this; instead, just edit
PHIElimination.cpp to require LiveIntervals.
llvm-svn: 174831
The original syntax for the attribute groups was ambiguous. For example:
declare void @foo() #1#0 = attributes { noinline }
The '#0' would be parsed as an attribute reference for '@foo' and not as a
top-level entity. In order to continue forward while waiting for a decision on
what the correct syntax is, I'm changing it to this instead:
declare void @foo() #1
attributes #0 = { noinline }
Repeat: This is TEMPORARY until we decide what the correct syntax should be.
llvm-svn: 174813
bitcode writer would generate abbrev records saying that the abbrev should be
filled with fixed zero-bit bitfields (this happens in the .bc writer when
the number of types used in a module is exactly one, since log2(1) == 0).
In this case, just handle it as a literal zero. We can't "just fix" the writer
without breaking compatibility with existing bc files, so have the abbrev reader
do the substitution.
Strengthen the assert in read to reject reads of zero bits so we catch such
crimes in the future, and remove the special case designed to handle this.
llvm-svn: 174801
instead of always 32-bits at a time) with two changes:
1. Make Read(0) always return zero without affecting the state of our cursor.
2. Hack word_t to always be 32 bits, as staging.
These two caveats will change shortly.
llvm-svn: 174800
This uses a liveness algorithm that does not depend on data from the
LiveVariables analysis, it is the first step towards removing
LiveVariables completely.
llvm-svn: 174774
check_cxx_symbol_exists requires CMake 2.8.6, so even though I
recommended it to Owen it's probably better to stay away for now.
This check is not technically correct because we're checking <math.h>
but then using <cmath> in the actual code, but if we run into problems we
can do the same sort of dance as isinf() and isnan() where we check /both/
headers and then write a wrapper header around them.
llvm-svn: 174773
to use -Wfoo instead of -Wno-foo. This works around a bug in some versions of
gcc, where it will silently accept an unknown -Wno-foo option, but will
generate an error for a compile which uses -Wno-foo if that compile also
triggers any warnings.
llvm-svn: 174770
This fixes a couple of bugs and incorrect assumptions,
in total four more piglit tests now pass.
v2: fix small bug in the dominator updating
Patch by: Christian König
Signed-off-by: Christian König <christian.koenig@amd.com>
llvm-svn: 174762
Patch by: Christian König
Intersecting loop handling was wrong.
Signed-off-by: Christian König <christian.koenig@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 174761
Otherwise we sometimes produce invalid code.
Patch by: Christian König
Signed-off-by: Christian König <christian.koenig@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 174760
This reverts r171041. This was a nice idea that didn't work out well.
Clang warnings need to be associated with warning groups so that they can
be selectively disabled, promoted to errors, etc. This simplistic patch didn't
allow for that. Enhancing it to provide some way for the backend to specify
a front-end warning type seems like overkill for the few uses of this, at
least for now.
llvm-svn: 174748
same so we put in the comment field an indicator when we think we are
emitting the 16 bit version. For the direct object emitter, the difference is
important as well as for other passes which need an accurate count of
program size. There will be other similar putbacks to this for various
instructions.
llvm-svn: 174747
Previously, even when a pre-increment load or store was generated,
we often needed to keep a copy of the original base register for use
with other offsets. If all of these offsets are constants (including
the offset which was combined into the addressing mode), then this is
clearly unnecessary. This change adjusts these other offsets to use the
new incremented address.
llvm-svn: 174746
This is a follow-up to the cost-model change in r174713 which splits
the cost of a memory operation between the address computation and the
actual memory access. In r174713, this cost is always added to the
memory operation cost, and so BBVectorize will do the same.
Currently, this new cost function is used only by ARM, and I don't
have any ARM test cases for BBVectorize. Assistance in generating some
good ARM test cases for BBVectorize would be greatly appreciated!
llvm-svn: 174743
Aside from the question of whether we report a warning or an error when we
can't satisfy a requested stack object alignment, the current implementation
of this is not good. We're not providing any source location in the diagnostics
and the current warning is not connected to any warning group so you can't
control it. We could improve the source location somewhat, but we can do a
much better job if this check is implemented in the front-end, so let's do that
instead. <rdar://problem/13127907>
llvm-svn: 174741
This updates the current references to links that work for me.
In the future, we should update the list of references itself to provide
information on newer architecture variants.
Thanks to Sean Silva for pointing out that the current links were broken!
llvm-svn: 174739
Thanks to help from Nadav and Hal, I have a more reasonable (and even
correct!) approach. This specifically penalizes the insertelement
and extractelement operations for the performance hit that will occur
on PowerPC processors.
llvm-svn: 174725
isn't using the default calling convention. However, if the transformation is
from a call to inline IR, then the calling convention doesn't matter.
rdar://13157990
llvm-svn: 174724
of lines which weren't being explicitly looked at and were printing incorrect results. These
values clearly must lie within 32 bits, so the casts are definitely safe.
llvm-svn: 174717
Adds a function to target transform info to query for the cost of address
computation. The cost model analysis pass now also queries this interface.
The code in LoopVectorize adds the cost of address computation as part of the
memory instruction cost calculation. Only there, we know whether the instruction
will be scalarized or not.
Increase the penality for inserting in to D registers on swift. This becomes
necessary because we now always assume that address computation has a cost and
three is a closer value to the architecture.
radar://13097204
llvm-svn: 174713
Attribute references are of this form:
define void @foo() #0#1#2 { ... }
Parse them for function attributes. If there's more than one reference, then
they are merged together.
llvm-svn: 174697
allowed size for the instruction. This code uses RegScavenger to fix this.
We sometimes need 2 registers for Mips16 so we must handle things
differently than how register scavenger is normally used.
llvm-svn: 174696
Add #include <unistd.h> to OProfileWrapper.cpp. This provides the declarations for 'read' and 'close' that are otherwise missing, and result in 'error: <foo> was not declared in this scope'.
This matches the issue as reported in bug 15055 "Can no longer compile LLVM with --with-oprofile"
llvm-svn: 174661
Certain vector operations don't vectorize well with the current
PowerPC implementation. Element insert/extract performs poorly
without VSX support because Altivec requires going through memory.
SREM, UREM, and VSELECT all produce bad scalar code.
There's a lot of work to do for the cost model before
autovectorization will be tuned well, and this is not an attempt to
address the larger problem.
llvm-svn: 174660
Remove all the unused code.
Patch by: Christian König
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 174656
Allows nexuiz to run with radeonsi.
Patch by: Michel Dänzer
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 174655
20 more little piglits with radeonsi.
Patch by: Michel Dänzer
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 174654
The _SGPR variants where wrong.
Patch by: Christian König
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 174653
v2: rebased on current upstream
Patch by: Christian König
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 174652
This is for the case when no processor is passed to the backend. This
prevents the
'' is not a recognized processor for this target (ignoring processor)
warning from being generated by clang.
llvm-svn: 174651
We don't want too many classes in a pass and the classes obscure the details. I
was going a little overboard with object modeling here. Replace classes by
generic code that handles both loads and stores.
No functionality change intended.
llvm-svn: 174646
PR15138 was opened because of a segfault in the Bitcode writer.
The actual issue ended up being a bug in APInt where calls to
APInt::getActiveWords returns a bogus value when the APInt value
is 0. This patch fixes the problem by ensuring that getActiveWords
returns 1 for 0 valued APInts.
llvm-svn: 174641
Handle vectors of 1 to 16 integers.
Change the intrinsic names to prevent the wrong one from being selected at
runtime due to the overloading.
Patch By: Michel Dänzer
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 174633
v1i32, v2i32, v8i32 and v16i32.
Only add VGPR register classes for integer vector types, to avoid attempts
copying from VGPR to SGPR registers, which is not possible.
Patch By: Michel Dänzer
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 174632
Use sub0-15 everywhere.
Patch by: Michel Dänzerr
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 174610
These instructions compare two floating point values and return an
integer true (-1) or false (0) value.
When compiling code generated by the Mesa GLSL frontend, the SET*_DX10
instructions save us four instructions for most branch decisions that
use floating-point comparisons.
llvm-svn: 174609