llvm-project

Commit Graph

Author	SHA1	Message	Date
Sanjay Patel	e68f71574f	InstCombine: fsub nsz 0, X ==> fsub nsz -0.0, X. Some day the backend may handle instruction-level fast math flags and make this transform unnecessary, but it's still better practice to use the canonical representation of fneg when possible (use a -0.0). This is a partial fix for PR20870 ( http://llvm.org/bugs/show_bug.cgi?id=20870 ). See also http://reviews.llvm.org/D6723. Differential Revision: http://reviews.llvm.org/D6731 llvm-svn: 225050	2014-12-31 22:14:05 +00:00
Sanjay Patel	ea3c802887	use -0.0 when creating an fneg instruction. Backends recognize (-0.0 - X) as the canonical form for fneg and produce better code. E.g., ppc64 with 0.0: lis r2, ha16(LCPI0_0); lfs f0, lo16(LCPI0_0)(r2); fsubs f1, f0, f1; blr vs. with -0.0: fneg f1, f1; blr. Differential Revision: http://reviews.llvm.org/D6723 llvm-svn: 224583	2014-12-19 16:44:08 +00:00
Sanjay Patel	c699a6117b	fold: sqrt(x * x * y) -> fabs(x) * sqrt(y). If a square root call has an FP multiplication argument that can be reassociated, then we can hoist a repeated factor out of the square root call and into a fabs(). In the simplest case, this: y = sqrt(x * x); becomes this: y = fabs(x); This patch relies on an earlier optimization in instcombine or reassociate to put the multiplication tree into a canonical form, so we don't have to search over every permutation of the multiplication tree. Because there are no IR-level FastMathFlags for intrinsics (PR21290), we have to use function-level attributes to do this optimization. This needs to be fixed for both the intrinsics and in the backend. Differential Revision: http://reviews.llvm.org/D5787 llvm-svn: 219944	2014-10-16 18:48:17 +00:00
Benjamin Kramer	76b15d04ff	InstCombine: Refactor fmul/fdiv combines to handle vectors. llvm-svn: 199598	2014-01-19 13:36:27 +00:00
Owen Anderson	48b842ef7c	Fix more instances of dropped fast math flags when optimizing FADD instructions. All found by inspection (aka grep). llvm-svn: 199528	2014-01-18 00:48:14 +00:00
Owen Anderson	e7321660c1	Fix two cases where we could lose fast math flags when optimizing FADD expressions. llvm-svn: 199427	2014-01-16 21:26:02 +00:00
Shuxin Yang	3a7ca6ec87	[Fast-math] Disable "(C1/X)*C2 => (C1*C2)/X" if C1/X has multiple uses. If "C1/X" has multiple uses, the only benefit of this transformation is to potentially shorten the critical path, but at the cost of introducing an additional div. The additional div may or may not incur a cost, depending on how div is implemented. If it is implemented using Newton–Raphson iteration, it doesn't seem to incur any cost (FIXME). However, if the div blocks the entire pipeline, that sounds pretty expensive. Let CodeGen take care of this transformation. This patch sees 6% on a benchmark. rdar://15032743 llvm-svn: 191037	2013-09-19 21:13:46 +00:00
Stephen Lin	c1c7a1309c	Update Transforms tests to use CHECK-LABEL for easier debugging. No functionality change. This update was done with the following bash script: find test/Transforms -name "*.ll" | \ while read NAME; do echo "$NAME" if ! grep -q "^; *RUN: *llc" $NAME; then TEMP=`mktemp -t temp` cp $NAME $TEMP sed -n "s/^define [^@]*@\([A-Za-z0-9_]*\)(.*$/\1/p" < $NAME | \ while read FUNC; do sed -i '' "s/;\(.*\)\([A-Za-z0-9_]*\):\( *\)@$FUNC\([( ]*\)\$/;\1\2-LABEL:\3@$FUNC(/g" $TEMP done mv $TEMP $NAME fi done llvm-svn: 186268	2013-07-14 01:42:54 +00:00
Shuxin Yang	389ed4b8f7	Fix a bug in fast-math fadd/fsub simplification. The problem is that the code mistakenly took for granted that the following constructor is able to create an APFloat from a SIGNED integer: APFloat::APFloat(const fltSemantics &ourSemantics, integerPart value) rdar://13486998 llvm-svn: 177906	2013-03-25 20:43:41 +00:00
Shuxin Yang	2eca602f8b	Perform factorization as a last resort of unsafe fadd/fsub simplification. Rules include: 1) x*y +/- x*z => x*(y +/- z) (the order of operands doesn't matter) 2) y/x +/- z/x => (y +/- z)/x The transformation is disabled if the new add/sub expr "y +/- z" is a denormal/NaN/infinity. rdar://12911472 llvm-svn: 177088	2013-03-14 18:08:26 +00:00
Quentin Colombet	e684a6d4aa	Fix a bug in instcombine for fmul in fast math mode. The pattern recognized by instcombine looks like: a = b * c; d = a +/- Cst, or: a = b * c; d = Cst +/- a. When creating the new operands for the fadd or fsub instruction following the related fmul, the first operand was created with the second original operand (M0 was created with C1) and the second with the first (M1 with Opnd0). The fix consists in creating the new operands with the appropriate original operand, i.e., M0 with Opnd0 and M1 with C1. llvm-svn: 176300	2013-02-28 21:12:40 +00:00
Michael Ilseman	1dd6f2a5ba	Preserve fast-math flags after reassociation and commutation. Update test cases llvm-svn: 174571	2013-02-07 01:40:15 +00:00
Michael Ilseman	10f2055812	whitespace llvm-svn: 174569	2013-02-07 01:27:13 +00:00
Shuxin Yang	e822745202	1. Hoist the minus sign as high as possible in an attempt to reveal some optimization opportunities (in the enclosing super-expressions). rule 1: (-0.0 - X) * Y => -0.0 - (X * Y) if expression "-0.0 - X" has only one reference. rule 2: (0.0 - X) * Y => -0.0 - (X * Y) if expression "0.0 - X" has only one reference, and the instruction is marked "noSignedZero". 2. Eliminate negation (the compiler was already able to handle these optimizations if the 0.0s are replaced with -0.0). rule 3: (0.0 - X) * (0.0 - Y) => X * Y rule 4: (0.0 - X) * C => X * -C if the expr is flagged "noSignedZero". 3. Rule 5: (X*Y) * X => (X*X) * Y if X != Y and the expression is flagged with "UnsafeAlgebra". The purpose of this transformation is two-fold: a) to form a power expression (of X). b) to potentially shorten the critical path: after the transformation, the latency of the instruction Y is amortized by the expression X*X, and therefore Y is in a "less critical" position compared to what it was before the transformation. 4. Remove the InstCombine code for simplifying "X * select". The reasons are as follows: a) The "select" is somewhat architecture-dependent, therefore the higher-level optimizers are not able to precisely predict whether the simplification really yields any performance improvement. b) The "select" operator is a bit complicated, and tends to obscure optimization opportunities. It is better to keep it as low as possible in the expr tree, and let CodeGen tackle the optimization. llvm-svn: 172551	2013-01-15 21:09:32 +00:00
Shuxin Yang	320f52a4b0	This change is to implement the following rules under the conditions C_A and/or C_R. C_A: reassociation is allowed. C_R: the reciprocal of a constant C is appropriate, which means either 1/C is exact, or reciprocal is allowed and 1/C is neither a special value nor a denormal. rule 1: (X/C1) / C2 => X / (C2*C1) (if C_A) => X * (1/(C2*C1)) (if C_A && C_R) rule 2: X*C1 / C2 => X * (C1/C2) (if C_A) rule 3: (X/Y)/Z => X/(Y*Z) (if C_A && at least one of Y and Z is a symbolic value) rule 4: Z/(X/Y) => (Z*Y)/X (similar to rule 3) rule 5: C1/(X*C2) => (C1/C2) / X (if C_A) rule 6: C1/(X/C2) => (C1*C2) / X (if C_A) rule 7: C1/(C2/X) => (C1/C2) * X (if C_A) llvm-svn: 172488	2013-01-14 22:48:41 +00:00
Shuxin Yang	f0537ab681	Consider expression "0.0 - X" as the negation of X if - this expression is explicitly marked no-signed-zero, or - no-signed-zero of this expression can be derived from some context. llvm-svn: 171922	2013-01-09 00:13:41 +00:00
Shuxin Yang	df0e61e793	This change is to implement the following rules: o. X/C1 * C2 => X * (C2/C1) (if C2/C1 is neither a special FP value nor a denormal) o. X/C1 * C2 -> X/(C1/C2) (if C2/C1 is either a special FP value or a denormal, but C1/C2 is a normal FP value) Let MDC denote a multiplication or division with one and only one operand being a constant. o. (MDC ± C1) * C2 => (MDC * C2) ± (C1 * C2) (so long as the constant folding doesn't yield any denormal or special value) llvm-svn: 171793	2013-01-07 21:39:23 +00:00
Shuxin Yang	37a1efe1c6	rdar://12801297 InstCombine for unsafe floating-point add/sub. llvm-svn: 170471	2012-12-18 23:10:12 +00:00
Shuxin Yang	f8e9a5a061	rdar://12753946 Implement rule : "x * (select cond 1.0, 0.0) -> select cond x, 0.0" llvm-svn: 170226	2012-12-14 18:46:06 +00:00
Shuxin Yang	f265351491	fix a typo llvm-svn: 168909	2012-11-29 18:09:37 +00:00
Shuxin Yang	01ab5d718b	Instruction::isAssociative() returns true for fmul/fadd if they are tagged "unsafe" mode. Approved by: Eli and Michael. llvm-svn: 168848	2012-11-29 01:47:31 +00:00
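
Several of the messages above treat "-0.0 - X" as the canonical negation and accept "0.0 - X" only when signed zeros may be ignored. The following is a minimal sketch of why, using Python floats as a stand-in for IEEE 754 doubles rather than LLVM itself. The helper names are hypothetical and do not appear in any of the commits.

```python
import math

def sign_bit(v: float) -> bool:
    """True when v carries a negative sign, including -0.0."""
    return math.copysign(1.0, v) < 0.0

def same_float(a: float, b: float) -> bool:
    """Equality that also distinguishes +0.0 from -0.0."""
    return a == b and sign_bit(a) == sign_bit(b)

def neg_from_pos_zero(x: float) -> float:
    # Hypothetical helper: models "fsub 0.0, X".
    return 0.0 - x

def neg_from_neg_zero(x: float) -> float:
    # Hypothetical helper: models "fsub -0.0, X", the canonical fneg form.
    return -0.0 - x

# For each sample, report whether each form agrees with true negation.
for x in (1.5, -2.0, math.inf, -0.0, 0.0):
    print(x,
          same_float(neg_from_neg_zero(x), -x),
          same_float(neg_from_pos_zero(x), -x))
```

Among these samples the two forms disagree only for x = +0.0, where "0.0 - x" yields +0.0 instead of -0.0. That single case is what a no-signed-zeros flag permits the optimizer to ignore.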

21 Commits
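
The sqrt(x * x * y) -> fabs(x) * sqrt(y) fold listed above is only valid under relaxed floating-point rules. As a rough illustration, assuming Python floats behave as IEEE 754 doubles, the sketch below compares the two forms. Both function names are hypothetical and only model the expressions from the commit message.

```python
import math

def sqrt_before(x: float, y: float) -> float:
    # Hypothetical helper: the original expression, evaluated left to right.
    return math.sqrt(x * x * y)

def sqrt_after(x: float, y: float) -> float:
    # Hypothetical helper: the folded form from the commit message.
    return math.fabs(x) * math.sqrt(y)

# Ordinary inputs: both forms agree.
print(sqrt_before(-3.0, 4.0), sqrt_after(-3.0, 4.0))

# Extreme inputs: x * x overflows before the square root is taken,
# so the original form is infinite while the folded form stays finite.
print(sqrt_before(1e200, 1e-300), sqrt_after(1e200, 1e-300))
```

The folded form can therefore change results near overflow, which is one reason a transform like this is gated on fast-math style permissions rather than applied unconditionally.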