This primarily affects add/fadd/mul/fmul/and/or/xor/pmuludq/pmuldq/max/min/fmaxc/fminc/pmaddwd/pavg.
We already commuted the unmasked and zero masked versions.
I've added 512-bit stack folding tests for most of the instructions
affected. I've tested needing commuting and not commuting across
unmasked, merged masked, and zero masked. The 128/256 bit instructions
should behave similarly.
llvm-svn: 362746
Summary: This creates an integration test for global constants
Reviewers: rnk
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62974
llvm-svn: 362745
Summary:
Other macros are used to declare namespaces, and should thus be handled
similarly. This is the case for crpcut's TESTSUITE macro, or for
unittest-cpp's SUITE macro:
TESTSUITE(Foo) {
TEST(MyFirstTest) {
assert(0);
}
} // TESTSUITE(Foo)
This patch deals with this cases by introducing a new option to specify
lists of namespace macros. Internally, it re-uses the system already in
place for foreach and statement macros, to ensure there is no impact on
performance.
Reviewers: krasimir, djasper, klimek
Reviewed By: klimek
Subscribers: acoomans, cfe-commits, klimek
Tags: #clang
Differential Revision: https://reviews.llvm.org/D37813
llvm-svn: 362740
A function for loop vectorization illegality reporting has been
introduced:
void LoopVectorizationLegality::reportVectorizationFailure(
const StringRef DebugMsg, const StringRef OREMsg,
const StringRef ORETag, Instruction * const I) const;
The function prints a debug message when the debug for the compilation
unit is enabled as well as invokes the optimization report emitter to
generate a message with a specified tag. The function doesn't cover any
complicated logic when a custom lambda should be passed to the emitter,
only generating a message with a tag is supported.
The function always prints the instruction `I` after the debug message
whenever the instruction is specified, otherwise the debug message
ends with a dot: 'LV: Not vectorizing: Disabled/already vectorized.'
Patch by Pavel Samolysov <samolisov@gmail.com>
llvm-svn: 362736
Summary:
(1) Function descriptor on AIX
On AIX, a called routine may have 2 distinct symbols associated with it:
* A function descriptor (Name)
* A function entry point (.Name)
The descriptor structure on AIX is the same as those in the ELF V1 ABI:
* The address of the entry point of the function.
* The TOC base address for the function.
* The environment pointer.
The descriptor symbol uses the same name as the source level function in C.
The function entry point is analogous to the symbol we would generate for a
function in a non-descriptor-based ABI, except that it is renamed by
prepending a ".".
Which symbol gets referenced depends on the context:
* Taking the address of the function references the descriptor symbol.
* Calling the function references the entry point symbol.
(2) Speaking of implementation on AIX, for direct function call target, we
create proper MCSymbol SDNode(e.g . ".foo") while constructing SDAG to
replace original TargetGlobalAddress SDNode. Then down the path, we can
take advantage of this MCSymbol.
Patch by: Xiangling_L
Reviewed by: sfertile, hubert.reinterpretcast, jasonliu, syzaara
Differential Revision: https://reviews.llvm.org/D62532
llvm-svn: 362735
This adds support for unary fneg based on the implementation of BinaryOperator without the soft float FP cost.
Previously we would just delegate to visitUnaryInstruction. I think the only real change is that we will pass the FastMath flags to SimplifyFNeg now.
Differential Revision: https://reviews.llvm.org/D62699
llvm-svn: 362732
This is a really silly bug that even a simple test w/an unconditional latch would have caught. I tried to guard against the case, but put it in the wrong if check. Oops.
llvm-svn: 362727
This patch is the first step towards ensuring MergeConsecutiveStores correctly handles non-temporal loads\stores:
1 - When merging load\stores we must ensure that they all have the same non-temporal flag. This is unlikely to occur, but can in strange cases where we're storing at the end of one page and the beginning of another.
2 - The merged load\store node must retain the non-temporal flag.
Differential Revision: https://reviews.llvm.org/D62910
llvm-svn: 362723
Many -static/-no-pie/-shared/-pie applications linked against glibc or musl
should work with this patch. This also helps FreeBSD PowerPC64 to migrate
their lib32 (PR40888).
* Fix default image base and max page size.
* Support new-style Secure PLT (see below). Old-style BSS PLT is not
implemented, so it is not suitable for FreeBSD rtld now because it doesn't
support Secure PLT yet.
* Support more initial relocation types:
R_PPC_ADDR32, R_PPC_REL16*, R_PPC_LOCAL24PC, R_PPC_PLTREL24, and R_PPC_GOT16.
The addend of R_PPC_PLTREL24 is special: it decides the call stub PLT type
but it should be ignored for the computation of target symbol VA.
* Support GNU ifunc
* Support .glink used for lazy PLT resolution in glibc
* Add a new thunk type: PPC32PltCallStub that is similar to PPC64PltCallStub.
It is used by R_PPC_REL24 and R_PPC_PLTREL24.
A PLT stub used in -fPIE/-fPIC usually loads an address relative to
.got2+0x8000 (-fpie/-fpic code uses _GLOBAL_OFFSET_TABLE_ relative
addresses).
Two .got2 sections in two object files have different addresses, thus a PLT stub
can't be shared by two object files. To handle this incompatibility,
change the parameters of Thunk::isCompatibleWith to
`const InputSection &, const Relocation &`.
PowerPC psABI specified an old-style .plt (BSS PLT) that is both
writable and executable. Linkers don't make separate RW- and RWE segments,
which causes all initially writable memory (think .data) executable.
This is a big security concern so a new PLT scheme (secure PLT) was developed to
address the security issue.
TLS will be implemented in D62940.
glibc older than ~2012 requires .rela.dyn to include .rela.plt, it can
not handle the DT_RELA+DT_RELASZ == DT_JMPREL case correctly. A hack
(not included in this patch) in LinkerScript.cpp addOrphanSections() to
work around the issue:
if (Config->EMachine == EM_PPC) {
// Older glibc assumes .rela.dyn includes .rela.plt
Add(In.RelaDyn);
if (In.RelaPlt->isLive() && !In.RelaPlt->Parent)
In.RelaDyn->getParent()->addSection(In.RelaPlt);
}
Reviewed By: ruiu
Differential Revision: https://reviews.llvm.org/D62464
llvm-svn: 362721
Summary: Dependence Analysis performs static checks to confirm validity
of delinearization. These checks often fail for 64-bit targets due to
type conversions and integer wrapping that prevent simplification of the
SCEV expressions. These checks would also fail at compile-time if the
lower bound of the loops are compile-time unknown.
For example:
void foo(int n, int m, int a[][m]) {
for (int i = 0; i < n; ++i)
for (int j = 0; j < m; ++j) {
a[i][j] = a[i+1][j-2];
}
}
opt -mem2reg -instcombine -indvars -loop-simplify -loop-rotate -inline
-pass-remarks=.* -debug-pass=Arguments
-da-permissive-validity-checks=false k3.ll -analyze -da
will produce the following by default:
da analyze - anti [* *|<]!
but will produce the following expected dependence vector if the
validity checks are disabled:
da analyze - consistent anti [1 -2]!
This revision will introduce a debug option that will leave the validity
checks in place by default, but allow them to be turned off. New tests
are added for cases where it cannot be proven at compile-time that the
individual subscripts stay in-bound with respect to a particular
dimension of an array. These tests enable the option to provide user
guarantee that the subscripts do not over/under-flow into other
dimensions, thereby producing more accurate dependence vectors.
For prior discussion on this topic, leading to this change, please see
the following thread:
http://lists.llvm.org/pipermail/llvm-dev/2019-May/132372.html
Reviewers: Meinersbur, jdoerfert, kbarton, dmgreen, fhahn
Reviewed By: Meinersbur, jdoerfert, dmgreen
Subscribers: fhahn, hiraditya, javed.absar, llvm-commits, Whitney,
etiotto
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D62610
llvm-svn: 362711
Summary:
This patch implements SDAG call lowering on AIX for functions
which only have parameters that could fit into GPRs.
Reviewers: hubert.reinterpretcast, syzaara
Differential Revision: https://reviews.llvm.org/D62823
llvm-svn: 362708
Summary: `change()` is an all purpose function; the revision adds simple shortcuts for the specific operations of inserting (before/after) or removing source.
Reviewers: ilya-biryukov
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D62621
llvm-svn: 362707
Summary:
This patch is part of a patch series to add support for FileCheck
numeric expressions. This specific patch introduces support for defining
numeric variable in a CHECK directive.
This commit introduces support for defining numeric variable from a
litteral value in the input text. Numeric expressions can then use the
variable provided it is on a later line.
Copyright:
- Linaro (changes up to diff 183612 of revision D55940)
- GraphCore (changes in later versions of revision D55940 and
in new revision created off D55940)
Reviewers: jhenderson, chandlerc, jdenny, probinson, grimar, arichardson, rnk
Subscribers: hiraditya, llvm-commits, probinson, dblaikie, grimar, arichardson, tra, rnk, kristina, hfinkel, rogfer01, JonChesterfield
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60386
llvm-svn: 362705
This patch implements the "CREATE_THIN" MRI script command, allowing thin archives to be created via MRI scripts.
Differential Revision: https://reviews.llvm.org/D62919
llvm-svn: 362704
Instantiate a ClangTidyDiagnosticConsumer also for the plugin case and
let it forward the diagnostics to the external diagnostic engine that is
already in place.
One minor difference to the clang-tidy executable case is that the
compiler checks/diagnostics are referred to with their original name.
For example, for -Wunused-variable the plugin will refer to the check as
"-Wunused-variable" while the clang-tidy executable will refer to that
as "clang-diagnostic- unused-variable". This is because the compiler
diagnostics never reach ClangTidyDiagnosticConsumer.
Differential Revision: https://reviews.llvm.org/D61487
llvm-svn: 362702
Summary:
The assertion "isIntegerConstantExpr" is triggered in the
isIntegerConstantExpr(), we should not call it if the expression is value
dependent.
Reviewers: gribozavr
Subscribers: xazax.hun, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D62947
llvm-svn: 362701
This patch is a follow up for D62018 to add lrint/llrint
support for float16.
Reviewed By: SjoerdMeijer
Differential Revision: https://reviews.llvm.org/D62863
llvm-svn: 362700
This patch is a follow up for D61391 to add lround/llround
support for float16.
Reviewed By: SjoerdMeijer
Differential Revision: https://reviews.llvm.org/D62861
llvm-svn: 362698