Summary:
This patch adds templated functions to MachineIRBuilder for some opcodes
and adds pattern matcher support for G_AND and G_OR.
Reviewers: aditya_nandakumar
Reviewed By: aditya_nandakumar
Subscribers: rovka, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D43309
llvm-svn: 325162
Summary:
Frequently, a percent in protos denotes a formatting specifier for string replacement.
Thus it is desirable to keep the percent together with what follows after it.
Reviewers: djasper
Reviewed By: djasper
Subscribers: klimek, cfe-commits
Differential Revision: https://reviews.llvm.org/D43294
llvm-svn: 325159
Previously we only invalidated the pressure set limit cached when the TargetRegisterInfo pointer changes. But as reserved regs and callee saved regs are used as part of calculating the limits we should invalidate when those change too.
I encountered this when reverting a patch from the 6.0 branch. One of the x86 test files had a function that used rbp as a frame pointer, making it reserved. It was followed by another function which didn't use rbp but had the same TRI so the pressure set limit cache was not invalidated. If i removed the function that used rbp as a frame pointer from the file, the remaining function then got a different register pressure limit for the GR16 pressure set. This caused the machine scheduler to change the scheduling for the function. This was an unexpected change from just deleting a function.
I don't have a test case for trunk because the particular x86 test case is different enough from the 6.0 branch to not be affected now.
Differential Revision: https://reviews.llvm.org/D43274
llvm-svn: 325153
This patch addresses a minor compatibility issue with GNU linkers.
Previously, --export-dynamic-symbol is completely ignored if you
pass --export-dynamic together.
Differential Revision: https://reviews.llvm.org/D43266
llvm-svn: 325152
Move computeLoopSafetyInfo, defined in Transforms/Utils/LoopUtils.h,
into the corresponding LoopUtils.cpp, as opposed to LICM where it resides
at the moment. This will allow other functions from Transforms/Utils
to reference it.
llvm-svn: 325151
This brings wasm into line with ELF and COFF in terms of
symbol types are represented.
Differential Revision: https://reviews.llvm.org/D43112
llvm-svn: 325150
Try to keep PACK*SDW/PACK*SWB as wide as possible, this helps ComputeNumSignBits as it can only peek through bitcasts to wider types, pre-AVX2 codegen was already doing this as it could peek through bitcasts/subvectors more easily than AVX2 could through shuffles.
This shouldn't affect existing results as calls to truncateVectorWithPACK ensure we have enough sign bits to pack to the same value, but it should make it possible to use truncateVectorWithPACK chains to perform saturation in combineTruncateWithSat with a future patch.
llvm-svn: 325149
The select may have been preventing a division by zero or INT_MIN/-1 so removing it might not be safe.
Fixes PR36362.
Differential Revision: https://reviews.llvm.org/D43276
llvm-svn: 325148
Kernel arguments likely read by all workitems and should not bypass
cache. Fixes performance hit in sub-dword argument loads.
Differential Revision: https://reviews.llvm.org/D43249
llvm-svn: 325146
Summary:
-ast-print prints omp pragmas with a trailing space. While this
behavior is likely of little concern to most users, surely it's
unintentional, and it's annoying for some source-level work I'm
pursuing. This patch focuses on omp pragmas, but it also fixes
init_seg and loop hint pragmas because they share implementation.
The testing strategy here is to add usually just one '{{$}}' per
relevant -ast-print test file. This seems to achieve good code
coverage. However, this strategy is probably easy to forget as the
tests evolve. That's probably fine as this fix is far from critical.
The main goal of the testing is to aid the initial review.
This patch also adds a fixme for "#pragma unroll", which prints as
"#pragma unroll (enable)", which is invalid syntax.
Reviewers: ABataev
Reviewed By: ABataev
Subscribers: guansong, cfe-commits
Differential Revision: https://reviews.llvm.org/D43204
llvm-svn: 325145
The prologue-end line record must be emitted after the last
instruction that is part of the function frame setup code and before
the instruction that marks the beginning of the function body.
Patch by Carlos Alberto Enciso!
Differential Revision: https://reviews.llvm.org/D41762
llvm-svn: 325143
This keeps with our current usage of 'match' and is easier to see that
the optional NSW only applies in the non-constant operand case.
llvm-svn: 325140
So that macros defined in inline assembly blocks are available to the
whole file.
This provides a consistent behavior with other assembly directives,
since equations for example are already preserved between inline
assembly blocks.
PR: 36110
Patch by Roger!
llvm-svn: 325139
According to the CUDA Programming Guide this is prohibited in
whole program compilation mode. This makes sense because external
references cannot be satisfied in that mode anyway. However,
such variables are allowed in separate compilation mode which
is a valid use case.
Differential Revision: https://reviews.llvm.org/D42923
llvm-svn: 325136
When creating high MachineMemOperand for MSTORE/MLOAD we supply
it with the original PointerInfo, while the pointer itself had been incremented.
The patch adds the proper offset to the PointerInfo.
llvm-svn: 325135
Summary:
Reversed loads are handled as gathering. But we can just reshuffle
these values. Patch adds support for vectorization of reversed loads.
Reviewers: RKSimon, spatel, mkuper, hfinkel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D43022
llvm-svn: 325134
This broke the Chromium build on Windows; see https://crbug.com/812231
> Fix for PR32992. Static const classes not exported.
>
> Patch by zahiraam!
>
> Differential Revision: https://reviews.llvm.org/D42968
llvm-svn: 325133
Prior to this patch, same instance of VFS was shared for concurrent
processing of the files in ClangdThreadingTest.StressTest.
It caused a data race as the same instance of InMemoryFileSystem was
mutated from multiple threads using setCurrentWorkingDirectory().
llvm-svn: 325132
This affects all outlined functions, not just tasks! Only show warning
when using Clang 5.0 or later.
Differential Revision: https://reviews.llvm.org/D43190
llvm-svn: 325131
If a load follows a store and reloads data that the store has written to memory, Intel microarchitectures can in many cases forward the data directly from the store to the load, This "store forwarding" saves cycles by enabling the load to directly obtain the data instead of accessing the data from cache or memory.
A "store forward block" occurs in cases that a store cannot be forwarded to the load. The most typical case of store forward block on Intel Core microarchiticutre that a small store cannot be forwarded to a large load.
The estimated penalty for a store forward block is ~13 cycles.
This pass tries to recognize and handle cases where "store forward block" is created by the compiler when lowering memcpy calls to a sequence
of a load and a store.
The pass currently only handles cases where memcpy is lowered to XMM/YMM registers, it tries to break the memcpy into smaller copies.
breaking the memcpy should be possible since there is no atomicity guarantee for loads and stores to XMM/YMM.
Change-Id: Ic41aa9ade6512e0478db66e07e2fde41b4fb35f9
llvm-svn: 325128
While the AVX512 VTRUNCS/VTRUNCUS instructions require legal types, truncateVectorWithPACK handles cases with multiples of legal types through splitting/concatenation. So we just need to ensure that the src/dst scalar types are correct and leave truncateVectorWithPACK to handle the rest of it.
llvm-svn: 325127
For basic blocks with instructions between the beginning of the block
and a call we have to duplicate the instructions before the call in all
split blocks and add PHI nodes for uses of the duplicated instructions
after the call.
Currently, the threshold for the number of instructions before a call
is quite low, to keep the impact on binary size low.
Reviewers: junbuml, mcrosier, davidxl, davide
Reviewed By: junbuml
Differential Revision: https://reviews.llvm.org/D41860
llvm-svn: 325126
There are a number of different situations when symbols are requested
to be ordered in the --symbol-ordering-file that cannot be ordered for
some reason. To assist with identifying these symbols, and either
tidying up the order file, or the inputs, a number of warnings have
been added. As some users may find these warnings unhelpful, due to how
they use the symbol ordering file, a switch has also been added to
disable these warnings.
The cases where we now warn are:
* Entries in the order file that don't correspond to any symbol in the input
* Undefined symbols
* Absolute symbols
* Symbols imported from shared objects
* Symbols that are discarded, due to e.g. --gc-sections or /DISCARD/ linker script sections
* Multiple of the same entry in the order file
Reviewed by: rafael, ruiu
Differential Revision: https://reviews.llvm.org/D42475
llvm-svn: 325125
We can use incremental dominator tree updates to avoid re-calculating
the dominator tree after interchanging 2 loops.
Reviewers: dmgreen, kuhar
Reviewed By: kuhar
Differential Revision: https://reviews.llvm.org/D43176
llvm-svn: 325122
Commit https://reviews.llvm.org/rL324489 added
EXPECT_EQ(false, N->isUnsigned());
which older GCC versions dislike for some reason. Anyway, it looks like the
proper GTest way is to use EXPECT_FALSE, etc.
Differential Revision: https://reviews.llvm.org/D43233
llvm-svn: 325121
This patch fixes clang to not consider braced initializers for
aggregate elements of arrays to be potentially dependent on the
indices of the initialized elements. Resolves bug 18978:
initialize a large static array = clang oom?
https://bugs.llvm.org/show_bug.cgi?id=18978
Differential Revision: https://reviews.llvm.org/D43187
llvm-svn: 325120
Preserve debug info from a dead 'and' instruction with a constant.
Patch by Djordje Todorovic.
Differential Revision: https://reviews.llvm.org/D43163
llvm-svn: 325119
Summary:
According to the C++11 standard [dcl.type.simple]p4:
The type denoted by decltype(e) is defined as follows:
- if e is an unparenthesized id-expression or an unparenthesized
class member access (5.2.5), decltype(e) is the type of the entity
named by e.
Currently Clang handles the 'member access' case incorrectly for
static data members (decltype returns T& instead of T). This patch
fixes the issue.
Reviewers: faisalv, rsmith, rogfer01
Reviewed By: rogfer01
Subscribers: rogfer01, cfe-commits
Differential Revision: https://reviews.llvm.org/D42969
llvm-svn: 325117