Summary: I added a new rank to ImplicitConversionRank enum to resolve the function overload ambiguity with vector types. Rank of scalar types conversion is lower than vector splat. So, we can choose which function should we call. See test for more details.
Reviewers: Anastasia, cfe-commits
Reviewed By: Anastasia
Subscribers: bader, yaxunl
Differential Revision: https://reviews.llvm.org/D30816
llvm-svn: 298366
Summary:
First iteration of SDWA peephole.
This pass tries to combine several instruction into one SDWA instruction. E.g. it converts:
'''
V_LSHRREV_B32_e32 %vreg0, 16, %vreg1
V_ADD_I32_e32 %vreg2, %vreg0, %vreg3
V_LSHLREV_B32_e32 %vreg4, 16, %vreg2
'''
Into:
'''
V_ADD_I32_sdwa %vreg4, %vreg1, %vreg3 dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:WORD_1 src1_sel:DWORD
'''
Pass structure:
1. Iterate over machine instruction in basic block and try to apply "SDWA patterns" to each of them. SDWA patterns match machine instruction into either source or destination SDWA operand. E.g. ''' V_LSHRREV_B32_e32 %vreg0, 16, %vreg1''' is matched to source SDWA operand '''%vreg1 src_sel:WORD_1'''.
2. Iterate over found SDWA operands and find instruction that could be potentially coverted into SDWA. E.g. for source SDWA operand potential instruction are all instruction in this basic block that uses '''%vreg0'''
3. Iterate over all potential instructions and check if they can be converted into SDWA.
4. Convert instructions to SDWA.
This review contains basic implementation of SDWA peephole pass. This pass requires additional testing fot both correctness and performance (no performance testing done).
There are several ways this pass can be improved:
1. Make this pass work on whole function not only basic block. As I can see this can be done right now without changes to pass.
2. Introduce more SDWA patterns
3. Introduce mnemonics to limit when SDWA patterns should apply
Reviewers: vpykhtin, alex-t, arsenm, rampitec
Subscribers: wdng, nhaehnle, mgorny
Differential Revision: https://reviews.llvm.org/D30038
llvm-svn: 298365
Summary:
When changing namespaces, the tool adds leading "::" to references that need to
be fully-qualified, which would affect readability.
We avoid adding "::" when the symbol name does not conflict with the new
namespace name. For example, a symbol name "na::nb::X" conflicts with "ns::na"
since it would be resolved to "ns::na::nb::X" in the new namespace.
Reviewers: hokein
Reviewed By: hokein
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D30493
llvm-svn: 298363
"Write" is an overloaded term. In collectInfo() till buildFlow(), it is
used to mean "must writes". However, within the memory based analysis,
it is used to mean "both may and must writes". Renaming the Write
variable helps clarify this difference.
Reviewers: grosser
Tags: #polly
Differential Revision: https://reviews.llvm.org/D31181
llvm-svn: 298361
This patch fixes an issue in the Optimize LEAs pass where redundant LEAs were
not removed because they were being used by debug values. The debug values are
now ignored when determining whether LEAs are redundant.
For now the debug values for the redundant LEAs are marked as undefined,
effectively lost. The intention is for a follow up patch which will attempt to
preserve the debug values where possible.
Patch by Andrew Ng.
Differential Revision: https://reviews.llvm.org/D30835
llvm-svn: 298360
Previously, PromoteIntRes_TRUNCATE() did not handle the case where
the operand needs widening, which resulted in llvm_unreachable().
This patch adds the needed handling, along with a test case.
Review: Eli Friedman, Simon Pilgrim.
https://reviews.llvm.org/D31077
llvm-svn: 298357
After the loop unroll threshold was increased in r295538, very
large constant expressions can be created. This prevents them
from having to be recursively scanned, leading to a compile
time blow-up.
Differential Revision: https://reviews.llvm.org/D30689
llvm-svn: 298356
Patch moves Sections array to
InputFile (root class for input files).
That allows to detemplate GdbIndexSection.
Differential revision: https://reviews.llvm.org/D30976
llvm-svn: 298345
The def operand of the new LG/LD should have the old def operands
flags and subreg index.
New test: test/CodeGen/SystemZ/fold-memory-op-impl.ll
Review: Ulrich Weigand
llvm-svn: 298341
Original r297566 is splitted into two parts.
This is part one, which adds "RUN" command for test cases.
Unit/arm/call_apsr.S is updated to support thumb1.
It also fixes a bug in arm/aeabi_uldivmod_test.c
gcc_personality_test is XFAILED as the framework cannot handle it so far.
cpu_model_test is also XFAILED for now as it is expected to return non-zero.
TODO: A few tests are XFAILed for armhf and aarch64.
We need further investigating. [1,2] Tracks the issue.
[1] https://bugs.llvm.org//show_bug.cgi?id=32260
[2] https://bugs.llvm.org//show_bug.cgi?id=32261
Reviewers: rengolin, compnerd, jroelofs, erik.pilkington, arphaman
Reviewed By: jroelofs
Subscribers: jroelofs, aemerson, srhines, nemanjai, llvm-commits, mgorny
Differential Revision: https://reviews.llvm.org/D30802
llvm-svn: 298339
CMake variable LLVM_DEFINITIONS collects preprocessor definitions provided
for host compiler that builds llvm components. A function
add_llvm_definitions was introduced in AddLLVMDefinitions.cmake to keep
track of these definitions and was intended to be a replacement for CMake
command add_definitions. Actually in many cases add_definitions is still
used and the content of LLVM_DEFINITIONS is not actual now. On the other
hand the current version of CMake allows getting set of definitions in a
more convenient way. This fix implements evaluation of the variable by
reading corresponding cmake property.
Differential Revision: https://reviews.llvm.org/D31125
llvm-svn: 298336
initialized in the ctor and they're only initialized to 'true' in ClangUserExpression.cpp
when specific languages are detected so we can use uninitialized values. This
bug has been present since the ivars were added in r144042.
<rdar://problem/31105864>
llvm-svn: 298333
This commit adds support for a new attribute that will be used to
distinguish between extensible and inextensible enums. There are three
main purposes of this attribute:
1. Give better control over when enum-related warnings are issued.
For example, in the code below, clang will not issue a -Wassign-enum
warning if the enum is marked "open":
enum __attribute__((enum_extensibility(closed))) EnumClosed {
B0 = 1, B1 = 10
};
enum __attribute__((enum_extensibility(open))) EnumOpen {
C0 = 1, C1 = 10
};
enum EnumClosed ec = 100; // warning issued
enum EnumOpen eo = 100; // no warning
2. Enable code-completion and debugging tools to offer better
suggestions.
3. Make it easier for swift's clang importer to determine which swift
type an enum should be mapped to.
For more details, see the discussion I started on cfe-dev:
http://lists.llvm.org/pipermail/cfe-dev/2017-February/052748.html
rdar://problem/12764379
rdar://problem/23145650
Differential Revision: https://reviews.llvm.org/D30766
llvm-svn: 298332
This fixes a bug introduced by r291559. The Module's FindType was
passing the original name not the basename in the case where it didn't
find any separators. I also added a testcase for this.
<rdar://problem/31159173>
llvm-svn: 298331
Setting dllexport on a declaration has no effect, as we do not emit export
directives for declarations.
Part of the fix for PR32334.
Differential Revision: https://reviews.llvm.org/D31162
llvm-svn: 298330
The glueless lowering of addc/adde in Thumb1 has known serious
miscompiles (see https://reviews.llvm.org/D31081), and r297820
causes an infinite loop for certain constructs. It's not
clear when they will be fixed, so let's just take them out
of the tree for now.
(I resolved a small conflict with r297453.)
llvm-svn: 298328
Summary:
This also delays setting the output filename based on the first input
argument until after processing /def.
Fixes PR32354
Reviewers: ruiu, pcc
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D31152
llvm-svn: 298327
This analyzes the dependency graph and computes all minimal
cycles. Equivalent cycles that differ only by rotation are
excluded, as are cycles that are "super-cycles" of other
smaller cycles. For example, if we discover the cycle
A -> C -> A, and then later A -> B -> C -> D -> A, this latter
cycle is not considered. Thus, it is possible that after
eliminating some cycles, new ones will appear. However,
this is the only way to make the algorithm terminate in
a reasonable amount of time.
llvm-svn: 298324
In doing so, clean up the MD5 interface a little. Most
existing users only care about the lower 8 bytes of an MD5,
but for some users that care about the upper and lower,
there wasn't a good interface. Furthermore, consumers
of the MD5 checksum were required to handle endianness
details on their own, so it seems reasonable to abstract
this into a nicer interface that just gives you the right
value.
Differential Revision: https://reviews.llvm.org/D31105
llvm-svn: 298322
The special case of zero sized values was previously not handled correctly.
This patch handles this by not promoting if the size is zero.
Patch by Tim Neumann.
Differential Revision: https://reviews.llvm.org/D31116
llvm-svn: 298320
- Fix a variable naming mismatch
- Fix gcc extension pointer arithmetic on void to cast to char *.
- Test that the header (and htmintrin.h) parse.
llvm-svn: 298318