fsub X, +0 ==> X
fsub X, -0 ==> X, when we know X is not -0
fsub +/-0.0, (fsub -0.0, X) ==> X
fsub nsz +/-0.0, (fsub +/-0.0, X) ==> X
fsub nnan ninf X, X ==> 0.0
fadd nsz X, 0 ==> X
fadd [nnan ninf] X, (fsub [nnan ninf] 0, X) ==> 0
where nnan and ninf have to occur at least once somewhere in this expression
fmul X, 1.0 ==> X
llvm-svn: 169940
m_ConstantFP - match and bind a float constant
m_SpecificConstantFP - match a specific floating point value or vector of floats of that value
m_FPOne - match a floating point 1.0 or vector of 1.0s
m_NegZero - match -0.0
m_AnyZero - match 0 or -0.0
llvm-svn: 169939
ScalarTargetTransformInfo::getIntImmCost() instead. "Legal" is a poorly defined
term for something like integer immediate materialization. It is always possible
to materialize an integer immediate. Whether to use it for memcpy expansion is
more a "cost" conceern.
llvm-svn: 169929
Using this mechanism, making sure that the options to pass a summary string or a named summary to frame variable do not have invalid values
<rdar://problem/11576143>
llvm-svn: 169927
"self" when those pointers are in registers.
Previously in this case the IRInterpreter would
handle them just as if the user had typed in
"$rdi", which isn't safe because $rdi is passed
in through the argument struct.
Now we correctly break out all three cases (i.e.,
normal variables in registers, $reg, and this/self),
and handle them in a way that's a little bit easier
to read and change.
This results in more accurate printing of "this" and
"self" pointers all around. I have strengthened the
optimized-code test case for Objective-C to ensure
that we catch regressions in this area reliably in
the future.
<rdar://problem/12693963>
llvm-svn: 169924
Given a thread-local symbol x with global-dynamic access, the generated
code to obtain x's address is:
Instruction Relocation Symbol
addis ra,r2,x@got@tlsgd@ha R_PPC64_GOT_TLSGD16_HA x
addi r3,ra,x@got@tlsgd@l R_PPC64_GOT_TLSGD16_L x
bl __tls_get_addr(x@tlsgd) R_PPC64_TLSGD x
R_PPC64_REL24 __tls_get_addr
nop
<use address in r3>
The implementation borrows from the medium code model work for introducing
special forms of ADDIS and ADDI into the DAG representation. This is made
slightly more complicated by having to introduce a call to the external
function __tls_get_addr. Using the full call machinery is overkill and,
more importantly, makes it difficult to add a special relocation. So I've
introduced another opcode GET_TLS_ADDR to represent the function call, and
surrounded it with register copies to set up the parameter and return value.
Most of the code is pretty straightforward. I ran into one peculiarity
when I introduced a new PPC opcode BL8_NOP_ELF_TLSGD, which is just like
BL8_NOP_ELF except that it takes another parameter to represent the symbol
("x" above) that requires a relocation on the call. Something in the
TblGen machinery causes BL8_NOP_ELF and BL8_NOP_ELF_TLSGD to be treated
identically during the emit phase, so this second operand was never
visited to generate relocations. This is the reason for the slightly
messy workaround in PPCMCCodeEmitter.cpp:getDirectBrEncoding().
Two new tests are included to demonstrate correct external assembly and
correct generation of relocations using the integrated assembler.
Comments welcome!
Thanks,
Bill
llvm-svn: 169910
Add -fslp-vectorize (with -ftree-slp-vectorize as an alias for gcc compatibility)
to provide a way to enable the basic-block vectorization pass. This uses the same
acronym as gcc, superword-level parallelism (SLP), also common in the literature,
to refer to basic-block vectorization.
Nadav suggested this as a follow-up to the adding of -fvectorize.
llvm-svn: 169909
Instead of doing a binary search over the whole diagnostic table (which weighs
a whopping 48k on x86_64), use the existing enums to compute the index in the
table. This avoids loading any unneeded data from the table and avoids littering
CPU caches with it. This code is in a hot path for code with many diagnostics.
1% speedup on -fsyntax-only gcc.c, which emits a lot of warnings.
llvm-svn: 169890
because that method is only getting called for MCInstFragment. These
fragments aren't even generated when RelaxAll is set, which is why the
flag reference here is superfluous. Removing it simplifies the code
with no harmful effects.
An assertion is added higher up to make sure this path is never
reached.
llvm-svn: 169886
Since now we have an autogenerated TOC, a manually written table of all passes
was removed.
Patch by Anthony Mykhailenko with small fixes by me.
llvm-svn: 169867
Summary:
Also rename DumpDeclarator() to dumpDecl(). Once Decl dumping is added, these will be the two main methods of the class, so this is just for consistency in naming.
There was a DumpStmt() method already, but there was no point in having it, so I have merged it into VisitStmt(). Similarly, DumpExpr() is merged into VisitExpr().
Reviewers: alexfh
Reviewed By: alexfh
CC: cfe-commits, alexfh
Differential Revision: http://llvm-reviews.chandlerc.com/D156
llvm-svn: 169865
Use explicitely aligned store and load instructions to deal with argument and
retval shadow. This matters when an argument's alignment is higher than
__msan_param_tls alignment (which is the case with __m128i).
llvm-svn: 169859