When unsafe algerbra is allowed calls to cabs(r) can be replaced by:
sqrt(creal(r)*creal(r) + cimag(r)*cimag(r))
Patch by Paul Walker, thanks!
Differential Revision: https://reviews.llvm.org/D40069
llvm-svn: 320901
As the first test shows, we could transform an llvm intrinsic which never sets errno
into a libcall which could set errno (even though it's marked readnone?), so that's
not ideal.
It's possible that we can also transform a libcall which could set errno to an
intrinsic given the fast-math-flags constraint, but that's deferred to determine
exactly which set of FMF are needed.
Differential Revision: https://reviews.llvm.org/D40150
llvm-svn: 318628
As discussed on llvm-dev:
http://lists.llvm.org/pipermail/llvm-dev/2016-November/107104.html
and again more recently:
http://lists.llvm.org/pipermail/llvm-dev/2017-October/118118.html
...this is a step in cleaning up our fast-math-flags implementation in IR to better match
the capabilities of both clang's user-visible flags and the backend's flags for SDNode.
As proposed in the above threads, we're replacing the 'UnsafeAlgebra' bit (which had the
'umbrella' meaning that all flags are set) with a new bit that only applies to algebraic
reassociation - 'AllowReassoc'.
We're also adding a bit to allow approximations for library functions called 'ApproxFunc'
(this was initially proposed as 'libm' or similar).
...and we're out of bits. 7 bits ought to be enough for anyone, right? :) FWIW, I did
look at getting this out of SubclassOptionalData via SubclassData (spacious 16-bits),
but that's apparently already used for other purposes. Also, I don't think we can just
add a field to FPMathOperator because Operator is not intended to be instantiated.
We'll defer movement of FMF to another day.
We keep the 'fast' keyword. I thought about removing that, but seeing IR like this:
%f.fast = fadd reassoc nnan ninf nsz arcp contract afn float %op1, %op2
...made me think we want to keep the shortcut synonym.
Finally, this change is binary incompatible with existing IR as seen in the
compatibility tests. This statement:
"Newer releases can ignore features from older releases, but they cannot miscompile
them. For example, if nsw is ever replaced with something else, dropping it would be
a valid way to upgrade the IR."
( http://llvm.org/docs/DeveloperPolicy.html#ir-backwards-compatibility )
...provides the flexibility we want to make this change without requiring a new IR
version. Ie, we're not loosening the FP strictness of existing IR. At worst, we will
fail to optimize some previously 'fast' code because it's no longer recognized as
'fast'. This should get fixed as we audit/squash all of the uses of 'isFast()'.
Note: an inter-dependent clang commit to use the new API name should closely follow
commit.
Differential Revision: https://reviews.llvm.org/D39304
llvm-svn: 317488
parameterized emit() calls
Summary: This is not functional change to adopt new emit() API added in r313691.
Reviewed By: anemet
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D38285
llvm-svn: 315476
Usually the frontend communicates the size of wchar_t via metadata and
we can optimize wcslen (and possibly other calls in the future). In
cases without the wchar_size metadata we would previously try to guess
the correct size based on the target triple; however this is fragile to
keep up to date and may miss users manually changing the size via flags.
Better be safe and stop guessing and optimizing if the frontend didn't
communicate the size.
Differential Revision: https://reviews.llvm.org/D38106
llvm-svn: 314185
The 1st try was reverted because it could inf-loop by creating a dead instruction.
Fixed that to not happen and added a test case to verify.
Original commit message:
Try to fold:
memcmp(X, C, ConstantLength) == 0 --> load X == *C
Without this change, we're unnecessarily checking the alignment of the constant data,
so we miss the transform in the first 2 tests in the patch.
I noted this shortcoming of LibCallSimpifier in one of the recent CGP memcmp expansion
patches. This doesn't help the example in:
https://bugs.llvm.org/show_bug.cgi?id=34032#c13
...directly, but it's worth short-circuiting more of these simple cases since we're
already trying to do that.
The benefit of transforming to load+cmp is that existing IR analysis/transforms may
further simplify that code. For example, if the load of the variable is common to
multiple memcmp calls, CSE can remove the duplicate instructions.
Differential Revision: https://reviews.llvm.org/D36922
llvm-svn: 311366
Try to fold:
memcmp(X, C, ConstantLength) == 0 --> load X == *C
Without this change, we're unnecessarily checking the alignment of the constant data,
so we miss the transform in the first 2 tests in the patch.
I noted this shortcoming of LibCallSimpifier in one of the recent CGP memcmp expansion
patches. This doesn't help the example in:
https://bugs.llvm.org/show_bug.cgi?id=34032#c13
...directly, but it's worth short-circuiting more of these simple cases since we're
already trying to do that.
The benefit of transforming to load+cmp is that existing IR analysis/transforms may
further simplify that code. For example, if the load of the variable is common to
multiple memcmp calls, CSE can remove the duplicate instructions.
Differential Revision: https://reviews.llvm.org/D36922
llvm-svn: 311333
Summary:
This changes SimplifyLibCalls to use the new OptimizationRemarkEmitter
API.
In fact, as SimplifyLibCalls is only ever called via InstCombine,
(as far as I can tell) the OptimizationRemarkEmitter is added there,
and then passed through to SimplifyLibCalls later.
I have avoided changing any remark text.
This closes PR33787
Patch by Sam Elliott!
Reviewers: anemet, davide
Reviewed By: anemet
Subscribers: davide, mehdi_amini, eraman, fhahn, llvm-commits
Differential Revision: https://reviews.llvm.org/D35608
llvm-svn: 309158
Going through the Constant methods requires redetermining that the Constant is a ConstantInt and then calling isZero/isOne/isMinusOne.
llvm-svn: 307292
This patch does an inline expansion of memcmp.
It changes the memcmp library call into an inline expansion when the size is
known at compile time and is under a target specified threshold.
This expansion is implemented in CodeGenPrepare and expands into straight line
code. The target specifies a maximum load size and the expansion works by using
this size to load the two sources, compare, and exit early if a difference is
found. It also has a special case when the memcmp result is used in a compare
to zero equality.
Differential Revision: https://reviews.llvm.org/D28637
llvm-svn: 304313
This continues the changes started when computeSignBit was replaced with this new version of computeKnowBits.
Differential Revision: https://reviews.llvm.org/D33431
llvm-svn: 303773
Refactor the strlen optimization code to work for both strlen and wcslen.
This especially helps with programs in the wild where people pass
L"string"s to const std::wstring& function parameters and the wstring
constructor gets inlined.
This also fixes a lingerind API problem/bug in getConstantStringInfo()
where zeroinitializers would always give you an empty string (without a
length) back regardless of the actual length of the initializer which
did not work well in the TrimAtNul==false causing the PR mentioned
below.
Note that the fixed getConstantStringInfo() needed fixes to SelectionDAG
memcpy lowering and may lead to some cases for out-of-bounds
zeroinitializer accesses not getting optimized anymore. So some code
with UB may produce out of bound memory reads now instead of just
producing zeros.
The refactoring "accidentally" fixes http://llvm.org/PR32124
Differential Revision: https://reviews.llvm.org/D32839
llvm-svn: 303461
Use variadic templates instead of relying on <cstdarg> + sentinel.
This enforces better type checking and makes code more readable.
Differential Revision: https://reviews.llvm.org/D32541
llvm-svn: 302571
Summary:
Do three things to help with that:
- Add AttributeList::FirstArgIndex, which is an enumerator currently set
to 1. It allows us to change the indexing scheme with fewer changes.
- Add addParamAttr/removeParamAttr. This just shortens addAttribute call
sites that would otherwise need to spell out FirstArgIndex.
- Remove some attribute-specific getters and setters from Function that
take attribute list indices. Most of these were only used from
BuildLibCalls, and doesNotAlias was only used to test or set if the
return value is malloc-like.
I'm happy to split the patch, but I think they are probably easier to
review when taken together.
This patch should be NFC, but it sets the stage to change the indexing
scheme to this, which is more convenient when indexing into an array:
0: func attrs
1: retattrs
2...: arg attrs
Reviewers: chandlerc, pete, javed.absar
Subscribers: david2050, llvm-commits
Differential Revision: https://reviews.llvm.org/D32811
llvm-svn: 302060
This patch introduces a new KnownBits struct that wraps the two APInt used by computeKnownBits. This allows us to treat them as more of a unit.
Initially I've just altered the signatures of computeKnownBits and InstCombine's simplifyDemandedBits to pass a KnownBits reference instead of two separate APInt references. I'll do similar to the SelectionDAG version of computeKnownBits/simplifyDemandedBits as a separate patch.
I've added a constructor that allows initializing both APInts to the same bit width with a starting value of 0. This reduces the repeated pattern of initializing both APInts. Once place default constructed the APInts so I added a default constructor for those cases.
Going forward I would like to add more methods that will work on the pairs. For example trunc, zext, and sext occur on both APInts together in several places. We should probably add a clear method that can be used to clear both pieces. Maybe a method to check for conflicting information. A method to return (Zero|One) so we don't write it out everywhere. Maybe a method for (Zero|One).isAllOnesValue() to determine if all bits are known. I'm sure there are many other methods we can come up with.
Differential Revision: https://reviews.llvm.org/D32376
llvm-svn: 301432
One of the fast-math optimizations is to replace calls to standard double
functions with their float equivalents, e.g. exp -> expf. However, this can
cause infinite loops for the following:
float expf(float val) { return (float) exp((double) val); }
A similar inline declaration exists in the MinGW-w64 math.h header file which
when compiled with -O2/3 and fast-math generates infinite loops.
So this fix checks that the calling function to the standard double function
that is being replaced does not match the float equivalent.
Differential Revision: https://reviews.llvm.org/D31806
llvm-svn: 301304
From a user prospective, it forces the use of an annoying nullptr to mark the end of the vararg, and there's not type checking on the arguments.
The variadic template is an obvious solution to both issues.
Differential Revision: https://reviews.llvm.org/D31070
llvm-svn: 299949
Module::getOrInsertFunction is using C-style vararg instead of
variadic templates.
From a user prospective, it forces the use of an annoying nullptr
to mark the end of the vararg, and there's not type checking on the
arguments. The variadic template is an obvious solution to both
issues.
llvm-svn: 299925
Module::getOrInsertFunction is using C-style vararg instead of
variadic templates.
From a user prospective, it forces the use of an annoying nullptr
to mark the end of the vararg, and there's not type checking on the
arguments. The variadic template is an obvious solution to both
issues.
Patch by: Serge Guelton <serge.guelton@telecom-bretagne.eu>
Differential Revision: https://reviews.llvm.org/D31070
llvm-svn: 299699
A common way to implement nearbyint is by fiddling with the floating
point environment and calling rint. This is used at least by the BSD
libm and musl. As such, canonicalizing the latter to the former will
create infinite loops for libm and generally pessimize performance, at
least when the generic C versions are used.
This change preserves the rint in the libcall translation and also
handles the domain truncation logic, so that rint with float argument
will be reduced to rintf etc.
llvm-svn: 299247
Summary:
This class is a list of AttributeSetNodes corresponding the function
prototype of a call or function declaration. This class used to be
called ParamAttrListPtr, then AttrListPtr, then AttributeSet. It is
typically accessed by parameter and return value index, so
"AttributeList" seems like a more intuitive name.
Rename AttributeSetImpl to AttributeListImpl to follow suit.
It's useful to rename this class so that we can rename AttributeSetNode
to AttributeSet later. AttributeSet is the set of attributes that apply
to a single function, argument, or return value.
Reviewers: sanjoy, javed.absar, chandlerc, pete
Reviewed By: pete
Subscribers: pete, jholewinski, arsenm, dschuff, mehdi_amini, jfb, nhaehnle, sbc100, void, llvm-commits
Differential Revision: https://reviews.llvm.org/D31102
llvm-svn: 298393
Summary:
Some frontends emit a speculate-and-select idiom for sqrt, wherein they compute
sqrt(x), check if x is negative, and select NaN if it is:
%cmp = fcmp olt double %a, -0.000000e+00
%sqrt = call double @llvm.sqrt.f64(double %a)
%ret = select i1 %cmp, double 0x7FF8000000000000, double %sqrt
This is technically UB as the LangRef is written today if %a is ever less than
-0. But emitting code that's compliant with the current definition of sqrt
would require a branch, which would then prevent us from matching this idiom in
SelectionDAG (which we do today -- ISD::FSQRT has defined behavior on negative
inputs), because SelectionDAG looks at one BB at a time.
Nothing in LLVM takes advantage of this undefined behavior, as far as we can
tell, and the fact that llvm.sqrt has UB dates from its initial addition to the
LangRef.
Reviewers: arsenm, mehdi_amini, hfinkel
Subscribers: wdng, llvm-commits
Differential Revision: https://reviews.llvm.org/D28797
llvm-svn: 293242
Summary:
The LibFunc::Func enum holds enumerators named for libc functions.
Unfortunately, there are real situations, including libc implementations, where
function names are actually macros (musl uses "#define fopen64 fopen", for
example; any other transitively visible macro would have similar effects).
Strictly speaking, a conforming C++ Standard Library should provide any such
macros as functions instead (via <cstdio>). However, there are some "library"
functions which are not part of the standard, and thus not subject to this
rule (fopen64, for example). So, in order to be both portable and consistent,
the enum should not use the bare function names.
The old enum naming used a namespace LibFunc and an enum Func, with bare
enumerators. This patch changes LibFunc to be an enum with enumerators prefixed
with "LibFFunc_". (Unfortunately, a scoped enum is not sufficient to override
macros.)
There are additional changes required in clang.
Reviewers: rsmith
Subscribers: mehdi_amini, mzolotukhin, nemanjai, llvm-commits
Differential Revision: https://reviews.llvm.org/D28476
llvm-svn: 292848
Add missing fabs(fpext) optimzation that worked with the call,
and also fixes it creating a second fpext when there were multiple
uses.
llvm-svn: 292172
fabs(x * x) is not generally safe to assume x is positive if x is a NaN.
This is also less general than it could be, so this will be replaced
with a transformation on the intrinsic.
llvm-svn: 291359
Lowering to llvm.cttz() will result in constant folding anyway
if the argument to ffs is a constant. Pointed out by Eli for
fls() in D14590.
llvm-svn: 289888
After r289755, the AssumptionCache is no longer needed. Variables affected by
assumptions are now found by using the new operand-bundle-based scheme. This
new scheme is more computationally efficient, and also we need much less
code...
llvm-svn: 289756
At least the plugin used by the LibreOffice build
(<https://wiki.documentfoundation.org/Development/Clang_plugins>) indirectly
uses those members (through inline functions in LLVM/Clang include files in turn
using them), but they are not exported by utils/extract_symbols.py on Windows,
and accessing data across DLL/EXE boundaries on Windows is generally
problematic.
Differential Revision: https://reviews.llvm.org/D26671
llvm-svn: 289647
Teach SimplifyLibcalls that in can treat functions annotated with
apcs, aapcs or aapcs_vfp like normal C functions if they only take
and return integer or pointer values, and the target is not iOS.
Differential Revision: https://reviews.llvm.org/D24453
llvm-svn: 281322
Hal pointed out that the semantic of our intrinsic and the libc
call are slightly different. Add a comment while I'm here to
explain why we can't emit an intrinsic. Thanks Hal!
llvm-svn: 278200