The tests with a constant sub operand were added with rL338143,
but the potential transform doesn't have that requirement, so
adding more tests with variable operands.
llvm-svn: 338150
The note is added in the following situation:
- We are throwing a nullability-related warning on an IVar
- The path goes through a method which *could have* (syntactically
determined) written into that IVar, but did not
rdar://42444460
Differential Revision: https://reviews.llvm.org/D49689
llvm-svn: 338149
There was a missing check for if a candidate list was entirely deleted. This
adds that check.
This fixes an asan failure caused by running test/CodeGen/AArch64/addsub_ext.ll
with the MachineOutliner enabled.
llvm-svn: 338148
This feature enables the fusion of such operations on Cortex A57 and Cortex
A72, as recommended in their Software Optimisation Guides, sections 4.14 and
4.11, respectively.
Differential revision: https://reviews.llvm.org/D49563
llvm-svn: 338147
Fix the order of callbacks related to the taskloop construct.
Add the iteration_count to work callbacks (according to the spec).
Use kmpc_omp_task() instead of kmp_omp_task() to include OMPT callbacks.
Add a testcase.
Patch by Simon Convent
Reviewed by: protze.joachim, hbae
Subscribers: openmp-commits
Differential Revision: https://reviews.llvm.org/D47709
llvm-svn: 338146
The ompt/tasks/task_types.c testcase did not test untied tasks properly. Now,
frame addresses are tested and two scheduling points are added at which the
task can switch to another thread. Due to scheduling effects, the frame address
could be NULL.
This needed a restructure of the way OMPT callbacks are called.
__ompt_task_finish() now as an extra parameter, whether a task is completed.
Its invocation has been moved into __kmp_task_finish(). Thus, the order of the
writes to the frame addresses is not subject to scheduling effects anymore.
Patch by Simon Convent
Reviewed by: protze.joachim, hbae
Subscribers: openmp-commits
Differential Revision: https://reviews.llvm.org/D49181
llvm-svn: 338145
The two more outputs are needed to match the return addresses when using the
Intel Compiler, as it generates more instructions between the fuzzy-printing
of the address and the runtime call.
Patch by Simon Convent
Reviewed By: protze.joachim, hbae
Differential Revision: https://reviews.llvm.org/D49373
llvm-svn: 338144
ObjCIvarExpr is *not* a subclass of MemberExpr, and a separate matcher
is required to support it.
Adding a hasDeclaration support as well, as it's not very useful without
it.
Differential Revision: https://reviews.llvm.org/D49701
llvm-svn: 338140
ObjCIvarExpr is *not* a subclass of MemberExpr, and a separate matcher
is required to support it.
Adding a hasDeclaration support as well, as it's not very useful without
it.
Differential Revision: https://reviews.llvm.org/D49701
llvm-svn: 338137
in some member function calls.
Specifically, when calling a conversion function, we would fail to
create the AST node representing materialization of the class object.
llvm-svn: 338135
Errors like the following are reported by:
https://urldefense.proofpoint.com/v2/url?u=http-3A__lab.llvm.org-3A8011_builders_llvm-2Dclang-2Dx86-5F64-2Dexpensive-2Dchecks-2Dwin_builds_11261&d=DwIBAg&c=5VD0RTtNlTh3ycd41b3MUw&r=DA8e1B5r073vIqRrFz7MRA&m=929oWPCf7Bf2qQnir4GBtowB8ZAlIRWsAdTfRkDaK-g&s=9k-wbEUVpUm474hhzsmAO29VXVvbxJPWD9RTgCD71fQ&e=
*** Bad machine code: Explicit definition marked as use ***
- function: cal_align1
- basic block: %bb.0 entry (0x47edd98)
- instruction: LDB $r3, $r2, 0
- operand 0: $r3
This is because RegState info was missing for ScratchReg inside
expandMEMCPY. This caused incomplete register usage information to
MachineInstr verifier which then would complain as there could be potential
code-gen issue if the complained MachineInstr is used in place where
register usage information matters even though the memcpy expanding is not
in such case as it happens at the last stage of IR optimization pipeline.
We should always specify those register usage information which compiler
couldn't deduct automatically whenever we add a hardware register manually.
Reported-by: Builder llvm-clang-x86_64-expensive-checks-win Build #11261
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Yonghong Song <yhs@fb.com>
llvm-svn: 338134
This patch enables the MachineOutliner by default in AArch64 under -Oz.
The MachineOutliner offers around a 4.5% improvement on the current -Oz code
size improvements.
We have done work into improving the debuggability of outlined code, so that
users of -Oz won't be surprised by the optimization. We have also been executing
the LLVM test suite and common external tests such as the SPEC suites
continuously with no issue. The outliner has a low compile-time overhead of
roughly 1%. At this point, the outliner would be a really good addition to the
-Oz pass pipeline!
llvm-svn: 338133
This is a follow-up suggested in D48970.
Alive proofs:
https://rise4fun.com/Alive/sII
We can eliminate an instruction in the usual select-of-constants
to bit hack transform by adjusting the add/sub with constant.
This is always a win.
There are more transforms that are likely wins, but they may need
target hooks in case some targets do not benefit.
This is another step towards making up for canonicalizing to
select-of-constants in rL331486.
llvm-svn: 338132
R600 can't handle immediates for BFE, these will be eliminated later.
Fixes powr/pow regressions n r600 since r334817
Differential Revision: https://reviews.llvm.org/D49641
llvm-svn: 338127
This patch adds support for various integer reduction operations:
SADDV signed add reduction to scalar
UADDV unsigned add reduction to scalar
SMAXV signed maximum reduction to scalar
SMINV signed minimum reduction to scalar
UMAXV unsigned maximum reduction to scalar
UMINV unsigned minimum reduction to scalar
ANDV logical AND reduction to scalar
ORV logical OR reduction to scalar
EORV logical EOR reduction to scalar
The reduction is predicated, e.g.
smaxv s0, p0, z1.s
performs a signed maximum reduction on active elements in z1,
and stores the (signed max value) result in s0.
llvm-svn: 338126
Summary: See the test case for a repro.
Reviewers: juliehockett, ioeric, hokein, aaron.ballman
Reviewed By: hokein
Subscribers: lebedev.ri, xazax.hun, cfe-commits
Differential Revision: https://reviews.llvm.org/D49862
llvm-svn: 338124
This patch adds support for various floating-point
reduction operations:
FADDA strictly-ordered add reduction, accumulating in scalar
FADDV recursive add reduction to scalar
FMAXV recursive max reduction to scalar
FMINV recursive min reduction to scalar
FMAXNMV recursive max number reduction to scalar
FMINNMV recursive min number reduction to scalar
The reduction is predicated, e.g.
fadda d0, p0, d0, z1.d
performs the add-reduction in strict order on active elements
in z1, accumulating into d0.
faddv d0, p0, z1.d
performs the add-reduction (not in strict order)
on active elements in z1, storing the result in d0.
llvm-svn: 338123
Summary:
This commit introduces a new macro, _LIBCPP_HIDE_FROM_ABI, whose goal is to
mark functions that shouldn't be part of libc++'s ABI. It marks the functions
as being hidden for dylib visibility purposes, and as having internal linkage
using Clang's __attribute__((internal_linkage)) when available, and
__always_inline__ otherwise.
It replaces _LIBCPP_INLINE_VISIBILITY, which was always using __always_inline__
to achieve similar goals, but suffered from debuggability and code size problems.
The full proposal, along with more background information, can be found here:
http://lists.llvm.org/pipermail/cfe-dev/2018-July/058419.html
This commit does not rename uses of _LIBCPP_INLINE_VISIBILITY to
_LIBCPP_HIDE_FROM_ABI: this wide reaching but mechanical change can
be done later when we've confirmed we're happy with the new macro.
In the future, it would be nice if we could optionally allow dropping
any internal_linkage or __always_inline__ attribute, which could result
in code size improvements. However, this is currently impossible for
reasons explained here: http://lists.llvm.org/pipermail/cfe-dev/2018-July/058450.html
Reviewers: EricWF, dexonsmith, mclow.lists
Subscribers: christof, dexonsmith, llvm-commits, mclow.lists
Differential Revision: https://reviews.llvm.org/D49240
llvm-svn: 338122
This patch adds support for transcendental acceleration
instructions 'FEXPA' (exponential accelerator) and 'FTSSEL'
(trigonometric select coefficient).
llvm-svn: 338121
Summary:
As it was, always exporting LLVM_LINK_LLVM_DYLIB caused out-of-tree
clients to lose the ability to link against the dylib, even if in-tree tools did
not. By only exporting the setting if it is enabled, out-of-tree clients get the
correct default, but may still choose if they can.
Reviewers: mgorny, beanz, labath, bogner, chandlerc
Reviewed By: bogner, chandlerc
Subscribers: bollu, llvm-commits
Differential Revision: https://reviews.llvm.org/D49843
llvm-svn: 338119
The tests with constants show a missing optimization.
Analysis for adds is better than subs, so this can also
help with other transforms. And codegen is better with
adds for targets like x86 (destructive ops, no sub-from).
https://rise4fun.com/Alive/llK
llvm-svn: 338118
The original Dex Iterators patch (https://reviews.llvm.org/rL338017)
caused problems for Clang 3.6 and Clang 3.7 due to the compiler bug
which prevented inferring template parameter (`Size`) in create(And|Or)?
functions. It was reverted in https://reviews.llvm.org/rL338054.
In this revision the mentioned helper functions were replaced with
variadic templated versions.
Proposed changes were tested on multiple compiler versions, including
Clang 3.6 which originally caused the failure.
llvm-svn: 338116
This is a follow-up for the patch rL335020. When we replace compares against
trunc with compares against wide IV, we can also replace signed predicates with
unsigned where it is legal.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D48763
llvm-svn: 338115
Summary:
As discussed in IRC with @rsmith, it is slightly not good to keep that in the `CastExpr` itself:
Given the explicit cast, which is represented in AST as an `ExplicitCastExpr` + `ImplicitCastExpr`'s,
only the `ImplicitCastExpr`'s will be marked as `PartOfExplicitCast`, but not the `ExplicitCastExpr` itself.
Thus, it is only ever `true` for `ImplicitCastExpr`'s, so we don't need to write/read/dump it for `ExplicitCastExpr`'s.
We don't need to worry that we write the `PartOfExplicitCast` in PCH after `CastExpr::path_iterator`,
since the `ExprImplicitCastAbbrev` is only used when the `NumBaseSpecs == 0`, i.e. there is no 'path'.
Reviewers: rsmith, rjmccall, erichkeane, aaron.ballman
Reviewed By: rsmith, erichkeane
Subscribers: vsk, cfe-commits, rsmith
Tags: #clang
Differential Revision: https://reviews.llvm.org/D49838
llvm-svn: 338108
This commit includes unit tests for D48828, which enhances InstSimplify to enable jump threading with a method whose return type is std::pair<int, bool> or std::pair<bool, int>.
I am going to commit the actual transformation later.
llvm-svn: 338107
Not sure why they were being explicitly excluded, but I believe all the math inside the if works. I changed the absolute value to be uint64_t instead of int64_t so INT64_MIN+1 wouldn't be signed wrap.
llvm-svn: 338101
Summary:
Some links were failing with "Global is external, but doesn't have
external or weak linkage!" in ThinLTO builds with debug
information. This happened when we elide the body of a global that is
referenced by debug info. This results in a declaration, which we
would then internalize - but declarations cannot be internal. This
change avoids the problem by not internalizing these declarations.
Fixes PR38046.
Reviewers: pcc, tejohnson
Subscribers: mehdi_amini, aprantl, hiraditya, JDevlieghere, steven_wu, dexonsmith, llvm-commits
Differential Revision: https://reviews.llvm.org/D49777
llvm-svn: 338100