If there is a store followed by a store with the same value to the same location, then the store is dead/noop. It can be removed.
This problem is found in spec2006-197.parser.
For example,
stur w10, [x11, #-4]
stur w10, [x11, #-4]
Then one of the two stur instructions can be removed.
Patch by David Xu!
llvm-svn: 218569
See http://reviews.llvm.org/D5495 for more details.
These are changes that are part of an effort to support building llgs, within the AOSP source tree, using the Android.mk
build system, when using the llvm/clang/lldb git repos from AOSP replaced with the experimental ones currently in
github.com/tfiala/aosp-{llvm,clang,lldb,compiler-rt}.
llvm-svn: 218568
llvm::huge_valf is defined in a header file, so it is initialized
multiple times in every compiled unit upon program startup.
With non-VC compilers huge_valf is set to a HUGE_VALF which the
compiler can probably optimize out.
With VC numeric_limits<float>::infinity() does not return a number
but a runtime structure member which therotically may change
between calls so the compiler does not optimize out the
initialization and it happens many times. It can be easily seen by
placing a breakpoint on the initialization line.
This patch moves llvm::huge_valf initialization to a source file
instead of the header.
llvm-svn: 218567
If too many parameters are involved in accesses used to create RTCs
we might end up with enormous compile times and RTC expressions.
The reason is that the lexmin/lexmax is dependent on all these
parameters and isl might need to create a case for every "ordering"
of them (e.g., p0 <= p1 <= p2, p1 <= p0 <= p2, ...).
The exact number of parameters allowed in accesses is defined by the
command line option -polly-rtc-max-parameters=XXX and set by default
to 8.
Differential Revision: http://reviews.llvm.org/D5500
llvm-svn: 218566
I spotted this by inspection when debugging something else, so I have no
test case what-so-ever, and am not even sure it is possible to
realistically trigger the bug. But this is what was intended here.
llvm-svn: 218565
and in the target shuffle combining when trying to widen vector
elements.
Previously only one of these was correct, and we didn't correctly
propagate zeroing target shuffle masks (which have a different sentinel
value from undef in non- target shuffle masks now). This isn't just
a missed optimization, this caused us to drop zeroing shuffles on the
floor and miscompile code. The added test case is one example of that.
There are other fixes to the test suite as a consequence of this as well
as restoring the undef elements in some of the masks that were lost when
I brought sanity to the actual *value* of the undef and zero sentinels.
I've also just cleaned up some of the PSHUFD and PSHUFLW and PSHUFHW
combining code, but that code really needs to go. It was a nice initial
attempt, but it isn't very principled and the recursive shuffle combiner
is much more powerful.
llvm-svn: 218562
to significantly more sane sentinels. Notably, everywhere else in the
backend's representation of shuffles uses '-1' to represent undef. The
target shuffle masks really shouldn't diverge from that, especially as
in a few places they are manipulated by shared code.
This causes us to lose some undef lanes in various test masks. I want to
get these back, but technically it isn't invalid and there are a *lot*
of bugs here so I want to try to establish a saner baseline for fixing
some of the bugs by aligning the specific senitnel values used.
llvm-svn: 218561
Tested two pending stops before notification, where one of the pending stop
requirements was already known to be stopped.
Tested pending thread stop before notification, then reporting thread with
pending stop died and verifies pending notification is made.
llvm-svn: 218559
Glad I did - caught a bug where the auto variable was not a reference
to a set and instead was a copy. I need to review rules on that!
llvm-svn: 218558
This is purely refactoring. No functional changes intended. PowerPC is the only target
that is currently using this interface.
The ultimate goal is to allow targets other than PowerPC (certainly X86 and Aarch64) to turn this:
z = y / sqrt(x)
into:
z = y * rsqrte(x)
And:
z = y / x
into:
z = y * rcpe(x)
using whatever HW magic they can use. See http://llvm.org/bugs/show_bug.cgi?id=20900 .
There is one hook in TargetLowering to get the target-specific opcode for an estimate instruction
along with the number of refinement steps needed to make the estimate usable.
Differential Revision: http://reviews.llvm.org/D5484
llvm-svn: 218553
Otherwise we can end up silently skipping an import. If we happen to be
building another module at the time, we may build a mysteriously broken
module and not know why it seems to be missing symbols.
llvm-svn: 218552
Users of getSectionContents shouldn't try to pass in BSS or virtual
sections. In all instances, this is a bug in the code calling this
routine.
N.B. Some COFF implementations (like CL) will mark their BSS sections as
taking space on disk. This would confuse COFFObjectFile into thinking
the section is larger than the file.
llvm-svn: 218549
So in fully linked images when a call is made through a stub it now gets a
comment like the following in the disassembly:
callq 0x100000f6c ## symbol stub for: _printf
indicating the call is to a symbol stub and which symbol it is for. This is
done for branch reference types and seeing if the branch target is in a stub
section and if so using the indirect symbol table entry for that stub and
using that symbol table entries symbol name.
llvm-svn: 218546
files in this directory. If it should be defined anywhere, it should be defined
when building lib/LTO/LTOCodeGenerator.cpp, but we've not had it defined there
for quite some time, so that doesn't really seem to be very important. (It also
would slow down the modules build by creating extra module variants.)
llvm-svn: 218544
lldb sets the variable SHARED_LIBRARY to 1, which breaks this conditional,
because older versions of CMake interpret
if ("${t}" STREQUAL "SHARED_LIBRARY")
as meaning
if ("${t}" STREQUAL "1")
in this case. Change the conditional so it does the right thing with both old
and new CMakes.
llvm-svn: 218542
It makes no sense to link in sanitizer runtimes in this case: the user
probably doesn't want to see any system/toolchain libs in his link if he
provides these flags, and the link will most likely fail anyway - as sanitizer
runtimes depend on libpthread, libdl, libc etc.
Also, see discussion in https://code.google.com/p/address-sanitizer/issues/detail?id=344
llvm-svn: 218541
that managed to elude all of my fuzz testing historically. =/
Something changed to allow this code path to actually be exercised and
it was doing bad things. It is especially heavily exercised by the
patterns that emerge when doing AVX shuffles that end up lowered through
the 128-bit code path.
llvm-svn: 218540
Reviewed at http://reviews.llvm.org/D4527
Fixed a test case failure on 32-bit Linux, I did right shift on intptr_t, instead it should have been uintptr_t.
llvm-svn: 218538
This has weird operand requirements so it's worthwhile
to have very strict checks for its operands.
Add different combinations of SGPR operands.
llvm-svn: 218535
Instead of moving the first SGPR that is different than the first,
legalize the operand that requires the fewest moves if one
SGPR is used for multiple operands.
This saves extra moves and is also required for some instructions
which require that the same operand be used for multiple operands.
llvm-svn: 218532
Disable the SGPR usage restriction parts of the DAG legalizeOperands.
It now should only be doing immediate folding until it can be replaced
later. The real legalization work is now done by the other
SIInstrInfo::legalizeOperands
llvm-svn: 218531
The base implementation of commuteInstruction is used
in some cases, but it turns out this has been broken for a
long time since modifiers were inserted between the real operands.
The base implementation of commuteInstruction also fails on immediates,
which also needs to be fixed.
llvm-svn: 218530
e.g. v_cndmask_b32 requires the condition operand be an SGPR.
If one of the source operands were an SGPR, that would be considered
the one SGPR use and the condition operand would be illegally moved.
llvm-svn: 218529
This needs a test, but I'm not sure if it is currently possible and
I originally hit it due to a bug. Right now the only global address
operands have no reason to be VALU instructions, although it
theoretically could be a problem.
llvm-svn: 218528
No test since the current SIISelLowering::legalizeOperands
effectively hides this, and the general uses seem to only fire
on SALU instructions which don't have modifiers between
the operands.
When trying to use legalizeOperands immediately after
instruction selection, it now sees a lot more patterns
it did not see before which break on this.
llvm-svn: 218527
No tests hit this, and I don't see any way a GlobalAddress
node would survive beyond lowering on SI. It it would, the
move should probably be inserted by selection.
llvm-svn: 218526
The annotation instructions are dropped during codegen and have no
impact on size. In some cases, the annotations were preventing the
unroller from unrolling a loop because the annotation calls were
pushing the cost over the unrolling threshold.
Differential Revision: http://reviews.llvm.org/D5335
llvm-svn: 218525
layer of tie-breaking sorting, it really helps to check that you're in
a tie first. =] Otherwise the whole thing cycles infinitely. Test case
added, another one found through fuzz testing.
llvm-svn: 218523
AVX support.
New test cases included. Note that none of the existing test cases
covered these buggy code paths. =/ Also, it is clear from this that
SHUFPS and SHUFPD are the most bug prone shuffle instructions in x86. =[
These were all detected by fuzz-testing. (I <3 fuzz testing.)
llvm-svn: 218522
This patch makes the ARM backend transform 3 operand instructions such as
'adds/subs' to the 2 operand version of the same instruction if the first
two register operands are the same.
Example: 'adds r0, r0, #1' will is transformed to 'adds r0, #1'.
Currently for some instructions such as 'adds' if you try to assemble
'adds r0, r0, #8' for thumb v6m the assembler would throw an error message
because the immediate cannot be encoded using 3 bits.
The backend should be smart enough to transform the instruction to
'adds r0, #8', which allows for larger immediate constants.
Patch by Ranjeet Singh.
llvm-svn: 218521
If a base class declares a destructor, we will add the implicit
destructor for the subclass in
ActOnFields -> AddImplicitlyDeclaredMembersToClass
But in Objective C++, we did not compute whether we have a trivial
destructor until after that in
CXXRecordDecl::completeDefinition()
This was leading to a mismatch between the class, which thought it had
no trivial destructor, and the CXXDestructorDecl, which considered
itself trivial. It turns out the reason we delayed setting this until
completeDefinition() was for a warning that has since been removed as
part of -Warc-abi, so we just do it eagerly now.
llvm-svn: 218520
Changed files:
config-ix.cmake: Enabled UBSan for MIPS32
sanitizer_stacktrace.cc: Program counter for MIPS32 is four byte aligned
and a delay slot so subtracted PC by 8 for getting call site address.
cast-overflow.cpp: Added big endian support for this test case.
Patch by Sagar Thakur.
Differential Revision: http://reviews.llvm.org/D4881
llvm-svn: 218519