Patch replaces old isEquivalentGEP implementation, and changes type of
comparison result from bool (equal or not) to {-1, 0, 1} (less, equal, greater).
This patch belongs to patch series that improves MergeFunctions
performance time from O(N*N) to O(N*log(N)).
llvm-svn: 208976
Patch replaces old isEquivalentOperation implementation, and changes type of
comparison result from bool (equal or not) to {-1, 0, 1} (less, equal, greater).
This patch belongs to patch series that improves MergeFunctions
performance time from O(N*N) to O(N*log(N)).
llvm-svn: 208973
TableGen has a fairly dubious heuristic to decide whether an alias should be
printed: does the alias have lest operands than the real instruction. This is
bad enough (particularly with no way to override it), but it should at least be
calculated consistently for both strings.
This patch implements that logic: first get the *correct* string for the
variant, in the same way as the Matcher, without guessing; then count the
number of whitespace chars.
There are basically 4 changes this brings about after the previous
commits; all of these appear to be good, so I have changed the tests:
+ ARM64: we print "neg X, Y" instead of "sub X, xzr, Y".
+ ARM64: we skip implicit "uxtx" and "uxtw" modifiers.
+ Sparc: we print "mov A, B" instead of "or %g0, A, B".
+ Sparc: we print "fcmpX A, B" instead of "fcmpX %fcc0, A, B"
llvm-svn: 208969
The canonical syntax is "fcmXY ..., #0.0".
This will be tested when the TableGen "should I print this Alias" heuristic is
fixed (very soon).
llvm-svn: 208968
This alias appears not to have an appropriate PrintMethod. Normally, I'd look
into it, but since AArch64 is disappearing soon it's probably not worth it.
This will be tested when the TableGen "should I print this Alias" heuristic is
fixed (very soon).
llvm-svn: 208967
These aliases are handled entirely in C++ and only having TableGen InstAliases
for some of them was confusing LLVM.
This will be tested when the TableGen "should I print this Alias" heuristic is
fixed (very soon).
llvm-svn: 208966
Certainly not without having a custom PrintMethod to invert the immediate
beforehand. But probably not at all.
This will be tested when the TableGen "should I print this Alias" heuristic is
fixed (very soon).
llvm-svn: 208964
In AT&T syntax, we should probably print the full "movl" or "movw". TableGen
used to ignore these aliases because it was miscounting the number of operands.
This fixes the issue.
This will be tested when the TableGen "should I print this Alias"
heuristic is fixed (very soon).
llvm-svn: 208963
Actually, MOV sometimes is canonical, but for now this is a better
approximation than what's there.
This will be tested when the TableGen "should I print this Alias" heuristic is
fixed (very soon).
llvm-svn: 208962
You can perform (say) an fcmle operation by swapping the operands on an fcmge,
but it shouldn't be printed like that.
This will be tested when the TableGen "should I print this Alias" heuristic is
fixed (very soon).
llvm-svn: 208961
We accept "ldr w3, [x1, #-1]" as a convenience, but we should still print the
canonical "ldur" form.
This will be tested when the TableGen "should I print this Alias" heuristic is
fixed (very soon).
llvm-svn: 208960
If an ANDS instruction has Rd == ZR it should be printed as TST since
its only effect is on the flags register NZCV.
This will be tested when the TableGen "should I print this Alias"
heuristic is fixed (very soon).
llvm-svn: 208959
MOV is almost always the right thing to print if possile. People understand it.
This will be tested when the TableGen "should I print this Alias" heuristic is
fixed (very soon).
llvm-svn: 208958
For example, the full instruction "sub w0, wzr, w1, uxtw" could print as either
"neg w0, w1" or "sub w0, wzr, w1". The former is better.
This will be tested when the TableGen "should I print this Alias" heuristic is
fixed (very soon).
llvm-svn: 208957
You can write "lslv w0, w1, w2" (probably for legacy reasons), but it should be
printed as simply "lsl".
This will be tested when the TableGen "should I print this Alias" heuristic is
fixed (very soon).
llvm-svn: 208956
Add some Windows on ARM specific library calls. These are provided by msvcrt,
and can be used to perform integer to floating-point conversions (and
vice-versa) mirroring similar functions in the RTABI.
llvm-svn: 208949
Sometimes a LLVM compilation may take more time then a client would like to
wait for. The problem is that it is not possible to safely suspend the LLVM
thread from the outside. When the timing is bad it might be possible that the
LLVM thread holds a global mutex and this would block any progress in any other
thread.
This commit adds a new yield callback function that can be registered with a
context. LLVM will try to yield by calling this callback function, but there is
no guaranteed frequency. LLVM will only do so if it can guarantee that
suspending the thread won't block any forward progress in other LLVM contexts
in the same process.
Once the client receives the call back it can suspend the thread safely and
resume it at another time.
Related to <rdar://problem/16728690>
llvm-svn: 208945
Allow multiple raw profiles to coexist in a single .profraw file,
given the following conditions:
- Zero padding at the end of or between profiles will be skipped.
- Each profile must start with a valid header.
- Mixing endianness or pointer sizes in concatenated profiles files is
not allowed.
This is needed to handle cases where a program's shared libraries are
profiled as well as the main executable itself, as we'll need to emit
each executable's counters. Combining the tables in the runtime would
be expensive for the instrumented program.
rdar://16918688
llvm-svn: 208938
This commit implements two command line switches -global-merge-on-external
and -global-merge-aligned, and both of them are false by default, so this
optimization is disabled by default for all targets.
For ARM64, some back-end behaviors need to be tuned to get this optimization
further enabled.
llvm-svn: 208934
Since type units in the dwo file are handled by a debug aware tool, they
don't need to leverage the ELF comdat grouping to implement
deduplication. Avoid creating all the .group sections for these as a
space optimization.
llvm-svn: 208930
It is more appropriate than the current situation, when one flag
(AbsoluteFilePath) is relevant only if another flag is set.
This refactoring would also simplify fetching the short function name
(stored in DW_AT_name) instead of a linkage name returned currently.
No functionality change.
llvm-svn: 208921
The allocas going out of scope are immediately killed by the return
instruction.
This is a resend of r208912, which was committed accidentally.
Reviewers: chandlerc
Differential Revision: http://reviews.llvm.org/D3792
llvm-svn: 208920
We have to iterate over all the calls that were inlined to find out if
any were musttail.
Sink another variable down to where its used.
llvm-svn: 208913
The allocas going out of scope are immediately killed by the return
instruction.
Reviewers: chandlerc
Differential Revision: http://reviews.llvm.org/D3630
llvm-svn: 208912
The interesting case is what happens when you inline a musttail call
through a musttail call site. In this case, we can't break perfect
forwarding or allow any stack growth.
Instead of merging control flow from the inlined return instruction
after a musttail call into the body of the caller, leave the inlined
return instruction in the caller so that the musttail call stays in the
tail position.
More work is required in http://reviews.llvm.org/D3630 to handle the
case where the inlined function has dynamic allocas or byval arguments.
Reviewers: chandlerc
Differential Revision: http://reviews.llvm.org/D3491
llvm-svn: 208910
The symlink needs to point to a relative path, so we don't break
building in a chroot.
Tested-by: Laurent Carlier <lordheavym@gmail.org>
llvm-svn: 208908
Added target specific combine rules to fold blend intrinsics according
to the following rules:
1) fold(blend A, A, Mask) -> A;
2) fold(blend A, B, <allZeros>) -> A;
3) fold(blend A, B, <allOnes>) -> B.
Added two new tests to verify that the new folding rules work for all
the optimized blend intrinsics.
llvm-svn: 208895
We now use SReg_* for integer types and VReg_* for floating-point types.
This should help simplify the SIFixSGPRCopies pass and no longer causes
ISel to insert a COPY after termiator instuctions that output a value.
This change is covered by exisitng tests.
llvm-svn: 208888
Previously, TableGen assumed that every aliased operand consumed precisely 1
MachineInstr slot (this was reasonable because until a couple of days ago,
nothing more complicated was eligible for printing).
This allows a couple more ARM64 aliases to print so we can remove the special
code.
On the X86 side, I've gone for explicit AT&T size specifiers as the default, so
turned off a few of the aliases that would have just started printing.
llvm-svn: 208880
In all cases, if a "mov" alias exists, it is the canonical form of the
instruction. Now that TableGen can support aliases containing syntax variants,
we can enable them and improve the quality of the asm output.
llvm-svn: 208874
Previously, we ignored the difference between V64 and V128 when parsing
assembly: they both got mapped to registers in the FPR128 class. This is
basically harmless at the moment because they both print and encode the same
way. However, it will affect the printing of aliases.
llvm-svn: 208866
Summary:
No support for symbols in place of the immediate yet since it requires new
relocations.
Depends on D3671
Reviewers: jkolek, zoran.jovanovic, vmedic
Reviewed By: vmedic
Differential Revision: http://reviews.llvm.org/D3689
llvm-svn: 208858
much more effectively when trying to constant fold a load of a constant.
Previously, we only handled bitcasts by trying to find a totally generic
byte representation of the constant and use that. Now, we look through
the bitcast to see what constant we might fold the load into, and then
try to form a constant expression cast of the found value that would be
equivalent to loading the value.
You might wonder why on earth this actually matters. Well, turns out
that the Itanium ABI causes us to create a single array for a vtable
where the first elements are virtual base offsets, followed by the
virtual function pointers. Because the array is homogenous the element
type is consistently i8* and we inttoptr the virtual base offsets into
the initial elements.
Then constructors bitcast these pointers to i64 pointers prior to
loading them. Boom, no more constant folding of virtual base offsets.
This is the first fix to LLVM to address the *insane* performance Eric
Niebler discovered with Clang on his range comprehensions[1]. There is
more to come though, this doesn't *really* fix the problem fully.
[1]: http://ericniebler.com/2014/04/27/range-comprehensions/
llvm-svn: 208856
Summary:
They aren't implemented for any ISA at the moment.
Depends on D3670
Reviewers: jkolek, zoran.jovanovic, vmedic
Reviewed By: vmedic
Differential Revision: http://reviews.llvm.org/D3671
llvm-svn: 208855
if ((x & C) == 0) x |= C becomes x |= C
if ((x & C) != 0) x ^= C becomes x &= ~C
if ((x & C) == 0) x ^= C becomes x |= C
if ((x & C) != 0) x &= ~C becomes x &= ~C
if ((x & C) == 0) x &= ~C becomes nothing
Z3 Verifications code for above transform
http://rise4fun.com/Z3/Pmsh
Differential Revision: http://reviews.llvm.org/D3717
llvm-svn: 208848
Summary:
This gets rid of a sub instruction by moving the negation to the
constant when valid.
Reviewers: nicholas
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D3773
llvm-svn: 208827
Abstract variables should never have/use locations. In this case the
data wasn't used, so no functional change intended here, just
simplification.
llvm-svn: 208820
Many old tests using prior schemas still had some brokenness here (both
indirect arrays and arrays with single bogus elements). Fixed those up
so they don't hit the new assertions.
Also reduced nesting in some places, etc.
llvm-svn: 208817
This is just unneccessary - we only create abstract definitions when
we're inlining anyway, so there's no reason to delay this to see if
we're going to inline anything.
llvm-svn: 208798
If the function has the landingpad instruction, then the
handlerdata should be emitted even if the function has
nouwnind attribute. Otherwise, following code will not
work:
void test1() noexcept {
try {
throw_exception();
} catch (...) {
log_unexpected_exception();
}
}
Since the cantunwind was incorrectly emitted and the
LSDA is not available.
llvm-svn: 208791
The commit r208166 will cause some regression on ARM EHABI.
This fix has been committed in r208715, and an assertion failure
test case has been committed in r208770.
This commit further extends the unittest so that the actual
value in the handlerdata will be checked.
llvm-svn: 208790
For example
tzcntl %edi, %ebx
testl %edi, %edi
je .label
can be rewritten into
tzcntl %edi, %ebx
jb .label
A minor complication is that tzcnt sets CF instead of ZF when the input
is zero, we have to rewrite users of the flags from ZF to CF. Currently
we recognize patterns using lzcnt, tzcnt and popcnt.
Differential Revision: http://reviews.llvm.org/D3454
llvm-svn: 208788
Summary:
To limit the number of tests required, only one 64-bit ISA prior to MIPS64 are tested.
rdhwr has been deliberately left without an ISA annotation for now. This is
because the assembler and CodeGen disagree on when the instruction is
available. Strictly speaking, it is only available in MIPS32r2 and
MIPS64r2. However, it is emulated by a kernel trap on earlier ISA's and is
necessary for TLS so CodeGen should emit it on older ISA's too.
Depends on D3697
Reviewers: vmedic
Reviewed By: vmedic
Differential Revision: http://reviews.llvm.org/D3698
llvm-svn: 208785
Summary:
Also use named constants for common opcode fields.
Depends on D3669
Reviewers: vmedic, zoran.jovanovic, jkolek
Reviewed By: jkolek
Differential Revision: http://reviews.llvm.org/D3670
llvm-svn: 208784
Most importantly, it gives debug location info to the coverage callback.
This change also removes 2 cases of unnecessary setDebugLoc when IRBuilder
is created with the same debug location.
llvm-svn: 208767
The ELF header e_flags field in the MIPS related test cases handled
incorrectly. The obj2yaml prints too many flags. I will fix that in the
next patches.
The patch reviewed by Michael Spencer and Sean Silva.
llvm-svn: 208752
The UDF instruction is a reserved undefined instruction space. The assembler
mnemonic was introduced with ARM ARM rev C.a. The instruction is not predicated
and the immediate constant is ignored by the CPU. Add support for the three
encodings for this instruction.
The changes to the invalid instruction test is due to the fact that the invalid
instructions actually overlap with the undefined instruction. Introduction of
the new instruction results in a partial decode as an undefined sequence. Drop
the tests as they are invalid instruction patterns anyways.
llvm-svn: 208751
This was reverted in r208642 due to regressions surrounding file changes
within lexical scopes causing inlining information to be lost.
The issue was in LexicalScopes::getOrCreateInlinedScope, where I was
previously testing "isLexicalBlock" which is false for
"DILexicalBlockFile" (a scope used to represent changes in the current
file name) and assuming it was then a function (breaking out of the
inlined scope path and reaching for the parent non-inlined scopes). By
inverting the condition and testing for "isSubprogram" the correct
behavior is attained.
(also found some weirdness in Clang, see r208742 when reducing this test
case - the resulting test case doesn't apply with the Clang fix, but
I've added a more realistic test case to inline-scopes.ll which does
reproduce the issue and demonstrate the fix)
llvm-svn: 208748
libraries before linking and executing the target objects.
This allows programs that use external calls (e.g. to libc) to be run under
llvm-rtdyld.
llvm-svn: 208739
member variable and sink the initialization of crbits into the
subtarget feature reset code.
No functional change, but this refactor will be used in a future
commit.
llvm-svn: 208726
We were using libLLVM-Major.Minor.Patch.so for the soname, but we
need the soname to stay consistent for all Major.Minor.* releases
otherwise operating system distributors will need to rebuild all
packages that link with LLVM every time there is a new point release.
This patch also reverses the compatibility symlink, so
libLLVM-Major.Minor.Patch.so is now a symlink that points
to libLLVM-Major-Minor.so.
llvm-svn: 208721
This allows code to statically accept a Function or a GlobalVariable, but
not an alias. This is already a cleanup by itself IMHO, but the main
reason for it is that it gives a lot more confidence that the refactoring to fix
the design of GlobalAlias is correct. That will be a followup patch.
llvm-svn: 208716
This commit was already commited as revision rL208689 and discussd in
phabricator revision D3704.
But the test file was crashing on OS X and windows.
I fixed the test file in the same way as in rL208340.
llvm-svn: 208711
We were using libLLVM-Major.Minor.Patch.so for the soname, but we
need the soname to stay consistent for all Major.Minor.* releases
otherwise operating system distributors will need to rebuild all
packages that link with LLVM every time there is a new point release.
This patch also reverses the compatibility symlink, so
libLLVM-Major.Minor.Patch.so is now a symlink that points
to libLLVM-Major-Minor.so.
llvm-svn: 208708
compared to 'AddrMode.BaseReg'. In the case that 'AddrMode.BaseReg' is
nullptr, 'Result' will also be nullptr, so the cast causes an assertion. We
should use dyn_cast_or_null here to check 'Result' is not null and it is an
instruction.
Bug found by Mats Petersson, and I reduced his IR to get a test case.
llvm-svn: 208705
Summary:
This required a new instruction group representing the 32-bit subset of
MIPS-3 that was available in MIPS32R2.
To limit the number of tests required, only one 32-bit and one 64-bit ISA
prior to MIPS32/MIPS64 are tested.
rdhwr has been deliberately left without an ISA annotation for now. This is
because the assembler and CodeGen disagree on when the instruction is
available. Strictly speaking, it is only available in MIPS32r2 and
MIPS64r2. However, it is emulated by a kernel trap on earlier ISA's and is
necessary for TLS so CodeGen should emit it on older ISA's too.
Depends on D3696
Reviewers: vmedic
Reviewed By: vmedic
Differential Revision: http://reviews.llvm.org/D3697
llvm-svn: 208690
Summary:
We are currently very close to the 32-bit limit of the current assembler
implementation. This is because there is no way to represent an instruction
that is available in, for example, Mips3 or Mips32. We have to define a
feature bit that represents this.
This patch cleans up a pair of redundant feature bits and slightly postpones the
point we will reach the limit.
Reviewers: zoran.jovanovic, jkolek, vmedic
Reviewed By: vmedic
Differential Revision: http://reviews.llvm.org/D3703
llvm-svn: 208685
We already had an assert for foo->RAUW(foo), but not for something like
foo->RAUW(GEP(foo)) and would go in an infinite loop trying to apply
the replacement.
llvm-svn: 208663
Normally, patterns like (add x, (setcc cc ...)) will be folded into
(csel x, x+1, not cc). However, if there is a ZEXT after SETCC, they
won't be folded. This patch recognizes the ZEXT and allows the
generation of CSINC.
This patch fixes bug 19680.
llvm-svn: 208660
The problem occurs when a non-i1 setcc is inverted. For example 'i8 = setcc' will get 'xor 0xff' to invert this. This is clearly wrong when the boolean contents are ZeroOrOne.
This patch introduces getLogicalNOT and updates SetCC legalisation to use it.
Reviewed by Hal Finkel.
llvm-svn: 208641