The LRV and STRV nodes carry an extra operand to indicate the
type of the memory access. This is redundant, since the nodes
are actually of class MemIntrinsicNode and therefore hold that
same information already as MemoryVT.
NFC intended.
llvm-svn: 345618
Correct costings of SK_ExtractSubvector requires the SubTy argument to indicate the type/size of the extracted subvector.
Unlike the rest of the shuffle kinds this means that the main Ty argument represents the source vector type not the destination!
I've done my best to fix a number of vectorizer uses:
SLP - the reduction epilogue costs should be using a SK_PermuteSingleSrc shuffle as these all occur at the hardware vector width - we're not extracting (illegal) subvector types. This is causing the cost model diffs as SK_ExtractSubvector costs are poorly handled and tend to just return 1 at the moment.
LV - I'm not clear on what the SK_ExtractSubvector should represents for recurrences - I've used a <1 x ?> subvector extraction as that seems to match the VF delta.
Differential Revision: https://reviews.llvm.org/D53573
llvm-svn: 345617
Summary: --keep-global-symbol and --globalize-symbol don't make sense for undefined symbols, so it should be ignored for those symbols. This matches GNU objcopy behavior.
Reviewers: jhenderson, alexshap, jakehehrlich, espindola
Reviewed By: jhenderson, jakehehrlich
Subscribers: emaste, arichardson, llvm-commits
Differential Revision: https://reviews.llvm.org/D53733
llvm-svn: 345614
Summary: This allows to remove `using namespace llvm;` in those *.cpp files
When we want to revisit the decision (everything resides in llvm::mca::*) in the future, we can move things to a nested namespace of llvm::mca::, to conceptually make them separate from the rest of llvm::mca::*
Reviewers: andreadb, mattd
Reviewed By: andreadb
Subscribers: javed.absar, tschuett, gbedwell, llvm-commits
Differential Revision: https://reviews.llvm.org/D53407
llvm-svn: 345612
Summary:
Added benchmarks for Construct, Copy, Move, Destroy, Relationals and
Read. On the ones that matter, the benchmarks tests hot and cold data,
and opaque and transparent inputs.
Reviewers: EricWF
Subscribers: christof, ldionne, libcxx-commits
Differential Revision: https://reviews.llvm.org/D53825
llvm-svn: 345611
Summary:
The macro may not have location (or more generally, the location may not exist),
e.g. if it originates from compiler's command-line.
The check complains on all the macros, even those without the location info.
Which means, it only says it does not like it. What is 'it'? I have no idea.
If we don't print the name, then there is no way to deal with that situation.
And in general, not printing name here forces the user to try to understand,
given, the macro definition location, what is the macro name?
This isn't fun.
Also, ignores-by-default the macros originating from command-line,
with an option to not ignore those.
I suspect some more issues may crop up later.
Reviewers: JonasToth, aaron.ballman, hokein, xazax.hun, alexfh
Reviewed By: JonasToth, aaron.ballman
Subscribers: nemanjai, kbarton, rnkovacs, cfe-commits
Tags: #clang-tools-extra
Differential Revision: https://reviews.llvm.org/D53817
llvm-svn: 345610
Added support for mapping of lambdas in the target regions. It scans all
the captures by reference in the lambda, implicitly maps those variables
in the target region and then later reinstate the addresses of
references in lambda to the correct addresses of the captured|privatized
variables.
llvm-svn: 345609
Summary:
Added support for correct mapping of variables captured by reference in
lambdas. That kind of mapping may appear only in target-executable
regions and must follow the original lambda or another lambda capture
for the same lambda.
The expected data: base address - the address of the lambda, begin
pointer - pointer to the address of the lambda capture, size - size of
the captured variable.
When OMP_TGT_MAPTYPE_PTR_AND_OBJ mapping type is seen in
target-executable region, the target address of the last processed item
is taken as the address of the original lambda `tgt_lambda_ptr`. Then,
the pointer to capture on the device is calculated like `tgt_lambda_ptr
+ (host_begin_pointer - host_begin_base)` and the target-based address
of the original variable (which host address is
`*(void**)begin_pointer`) is written to that pointer.
Reviewers: kkwli0, gtbercea, grokos
Subscribers: openmp-commits
Differential Revision: https://reviews.llvm.org/D51107
llvm-svn: 345608
shuffle (insert ?, Scalar, IndexC), V1, Mask --> insert V1, Scalar, IndexC'
The motivating case is at least a couple of steps away: I noticed that
SLPVectorizer does not analyze shuffles as well as sequences of
insert/extract in PR34724:
https://bugs.llvm.org/show_bug.cgi?id=34724
...so SLP may fail to vectorize when source code has shuffles to start
with or instcombine has converted insert/extract to shuffles.
Independent of that, an insertelement is always a simpler op for IR
analysis vs. a shuffle, so we should transform to insert when possible.
I don't think there's any codegen concern here - if a target can't insert
a scalar directly to some fixed element in a vector (x86?), then this
should get expanded to the insert+shuffle that we started with.
Differential Revision: https://reviews.llvm.org/D53507
llvm-svn: 345607
The SchedModel allows the addition of ReadAdvances to express that certain
operands of the instructions are needed at a later point than the others.
RegAlloc may add pseudo operands that are not part of the instruction
descriptor, and therefore cannot have any read advance entries. This meant
that in some cases the desired read advance was nullified by such a pseudo
operand, which still had the original latency.
This patch fixes this by making sure that such pseudo operands get a zero
latency during DAG construction.
Review: Matthias Braun, Ulrich Weigand.
https://reviews.llvm.org/D49671
llvm-svn: 345606
Only store the NRVO candidate if needed in ReturnStmt.
A good chuck of all of the ReturnStmt have no NRVO candidate
(more than half when parsing all of Boost). For all of them
this saves one pointer. This has no impact on children().
Differential Revision: https://reviews.llvm.org/D53716
Reviewed By: rsmith
llvm-svn: 345605
This commit is a combination of two patches:
* "Fix in getScalarizationOverhead()"
If target returns false in TTI.prefersVectorizedAddressing(), it means the
address registers will not need to be extracted. Therefore, there should
be no operands scalarization overhead for a load instruction.
* "Don't pass the instruction pointer from getMemInstScalarizationCost."
Since VF is always > 1, this is a cost query for an instruction in the
vectorized loop and it should not be evaluated within the scalar
context of the instruction.
Review: Ulrich Weigand, Hal Finkel
https://reviews.llvm.org/D52351https://reviews.llvm.org/D52417
llvm-svn: 345603
Narrowing vector binops came up in the demanded bits discussion in D52912.
I don't think we're going to be able to do this transform in IR as a canonicalization
because of the risk of creating unsupported widths for vector ops, but we already have
a DAG TLI hook to allow what I was hoping for: isExtractSubvectorCheap(). This is
currently enabled for x86, ARM, and AArch64 (although only x86 has existing regression
test diffs).
This is artificially limited to not look through bitcasts because there are so many
test diffs already, but that's marked with a TODO and is a small follow-up.
Differential Revision: https://reviews.llvm.org/D53784
llvm-svn: 345602
Don't store the data for the condition variable if not needed.
This cuts the size of WhileStmt by up to a pointer.
The order of the children is kept the same.
Differential Revision: https://reviews.llvm.org/D53715
Reviewed By: rjmccall
llvm-svn: 345597
Sub, SDiv and UDiv are not commutative, so only the RHS operand can fold a
load. This patch adds a check for this.
Review: Ulrich Weigand
https://reviews.llvm.org/D53791
llvm-svn: 345596
Summary: So we can keep that not-so-great logic in one place.
Reviewers: rsmith, aaron.ballman
Reviewed By: rsmith
Subscribers: nemanjai, kbarton, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D53837
llvm-svn: 345594
Sort the headers more correctly according to NetBSD style.
Prevent in this code part clang-format, as shuffling the order
will cause build failures.
llvm-svn: 345586
This fixes an assertion when constant folding a GEP when the part of the offset
was in i32 (IndexSize, as per DataLayout) and part in the i64 (PointerSize) in
the newly created test case.
Differential Revision: https://reviews.llvm.org/D52609
llvm-svn: 345585
Summary:
The final pattern.
There is no test changes:
* We are looking for the pattern with one-use of it's mask,
* If the mask is one-use, D48768 will unfold it into pattern d.
* Thus, the tests have extra-use on the mask.
* Thus, only the BMI2 BZHI can be tested, and it already worked.
* So there is no BMI1 test coverage, we just assume it works since it uses the same codepath.
Reviewers: craig.topper, RKSimon
Reviewed By: RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D53575
llvm-svn: 345584
Summary:
Because of the D48768, that pattern is always unfolded into pattern d,
thus we had no test coverage.
Reviewers: RKSimon, craig.topper
Reviewed By: craig.topper
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D53574
llvm-svn: 345583
Register new syscall getsockopt2.
Drop removed syscalls pmc_get_info and pmc_control.
While there address compiler warnings about potentially
unused variables.
llvm-svn: 345582
Register new syscall getsockopt2.
Drop removed syscalls pmc_get_info and pmc_control.
While there address compiler warnings about potentially
unused variables.
llvm-svn: 345580
Visual Studio has a bug where it converts the integer literal 2147483648
into an unsigned int instead of a long long (i.e. it follows C89 rules).
The bug has been reported as:
https://developercommunity.visualstudio.com/content/problem/141813/-2147483648-c4146-error.html.
Because of this bug, we were getting a signed/unsigned comparison
warning in VS2015 from the old code (the subsequent unary negation had
no effect on the type).
Reviewed by: sfertile
Differential Revision: https://reviews.llvm.org/D53821
llvm-svn: 345579
Similar to FoldCONCAT_VECTORS, this patch adds FoldBUILD_VECTOR to simplify cases that can avoid the creation of the BUILD_VECTOR - if all the operands are UNDEF or if the BUILD_VECTOR simplifies to a copy.
This exposed an assumption in some AMDGPU code that getBuildVector was guaranteed to be a BUILD_VECTOR node that I've tried to handle.
Differential Revision: https://reviews.llvm.org/D53760
llvm-svn: 345578
Summary:
This patch fixes issues with a stack realignment.
MSVC maintains two frame pointers (`ebx` and `ebp`) for a realigned stack - one
is used for access to function parameters, while another is used for access to
locals. To support this the patch:
- adds an alternative frame pointer (`ebx`);
- considers stack realignment instructions (e.g. `and esp, -32`);
- along with CFA (Canonical Frame Address) which point to the position next to
the saved return address (or to the first parameter on the stack) introduces
AFA (Aligned Frame Address) which points to the position of the stack pointer
right after realignment. AFA is used for access to registers saved after the
realignment (see the test);
Here is an example of the code with the realignment:
```
struct __declspec(align(256)) OverAligned {
char c;
};
void foo(int foo_arg) {
OverAligned oa_foo = { 1 };
auto aaa_foo = 1234;
}
void bar(int bar_arg) {
OverAligned oa_bar = { 2 };
auto aaa_bar = 5678;
foo(1111);
}
int main() {
bar(2222);
return 0;
}
```
and here is the `bar` disassembly:
```
push ebx
mov ebx, esp
sub esp, 8
and esp, -100h
add esp, 4
push ebp
mov ebp, [ebx+4]
mov [esp+4], ebp
mov ebp, esp
sub esp, 200h
mov byte ptr [ebp-200h], 2
mov dword ptr [ebp-4], 5678
push 1111 ; foo_arg
call j_?foo@@YAXH@Z ; foo(int)
add esp, 4
mov esp, ebp
pop ebp
mov esp, ebx
pop ebx
retn
```
Reviewers: labath, zturner, jasonmolenda, stella.stamenova
Reviewed By: jasonmolenda
Subscribers: abidh, lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D53435
llvm-svn: 345577