VGPRs are spilled to LDS. This still needs more testing, but
we need to at least enable it at -O0, because the fast register
allocator spills all registers that are live at the end of blocks
and without this some future commits will break the
flat-address-space.ll test.
v2: Only calculate thread id once
v3: Move insertion of spill instructions to
SIRegisterInfo::eliminateFrameIndex()
llvm-svn: 218348
the native AVX2 instructions.
Note that the test case is really frustrating here because VPERMD
requires the mask to be in the register input and we don't produce
a comment looking through that to the constant pool. I'm going to
attempt to improve this in a subsequent commit, but not sure if I will
succeed.
llvm-svn: 218347
detection. It was incorrectly handling undef lanes by actually treating
an undef lane in the first 128-bit lane as a *numeric* shuffle value.
Fortunately, this almost always DTRT and disabled detecting repeated
patterns. But not always. =/ This patch introduces a much more
principled approach and fixes the miscompiles I spotted by inspection
previously.
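
A minimal sketch of the idea, not the code from this patch: when checking
whether a mask repeats the same pattern in every 128-bit lane, an undef
element is skipped instead of being compared as a number. The names
kUndef and isLaneRepeatedMask are hypothetical, and the sketch assumes a
single-input shuffle with undef encoded as -1.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sentinel for an undef mask element.
constexpr int kUndef = -1;

// Returns true if every lane of `mask` applies the same in-lane pattern.
// `laneSize` is the number of elements per 128-bit lane. On success,
// `repeated` holds the per-lane pattern, with kUndef wherever no lane
// constrains that position.
bool isLaneRepeatedMask(const std::vector<int> &mask, int laneSize,
                        std::vector<int> &repeated) {
  repeated.assign(static_cast<std::size_t>(laneSize), kUndef);
  const int size = static_cast<int>(mask.size());
  for (int i = 0; i < size; ++i) {
    const int m = mask[static_cast<std::size_t>(i)];
    if (m == kUndef)
      continue; // undef places no constraint on the pattern
    if (m / laneSize != i / laneSize)
      return false; // element crosses a lane boundary
    int &slot = repeated[static_cast<std::size_t>(i % laneSize)];
    const int local = m % laneSize;
    if (slot == kUndef)
      slot = local; // first lane to constrain this position
    else if (slot != local)
      return false; // lanes disagree
  }
  return true;
}
```

With four elements per lane, the mask 0,1,u,3,4,u,6,7 is reported as
repeating with pattern 0,1,2,3 because each undef is filled in from the
other lane.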
llvm-svn: 218346
The export table descriptor is a data structure to keep information
about the export table. It contains a symbol name, and the name may
or may not be mangled.
We need unmangled names for the export table, so we demangle them
before writing them to the export table.
Obviously this is not a correct round-trip conversion. Demangling could
drop a leading underscore from an unmangled symbol, because such a name
is indistinguishable from a mangled one.
What we need to do is to keep unmangled names. This patch does that.
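
The round-trip problem can be illustrated with a small sketch. The
mangle and demangle helpers below are hypothetical stand-ins that assume
the simplest scheme (mangling prepends one underscore); they are not the
linker's functions, and ExportDesc here is a simplified model of the
descriptor.

```cpp
#include <cassert>
#include <string>

// Hypothetical mangling: prepend a single underscore.
std::string mangle(const std::string &name) { return "_" + name; }

// Hypothetical demangling: strip one leading underscore if present.
// It cannot tell a mangled "foo" from an unmangled "_foo".
std::string demangle(const std::string &name) {
  if (!name.empty() && name[0] == '_')
    return name.substr(1);
  return name;
}

// Simplified descriptor that always stores the unmangled name, so no
// demangling is needed when the export table is written.
struct ExportDesc {
  std::string unmangledName;
};

std::string exportTableName(const ExportDesc &desc) {
  return desc.unmangledName;
}
```

Under this assumption demangle("_foo") yields "foo" whether the input was
the mangled form of foo or an unmangled symbol really named _foo, which
is why the descriptor keeps the unmangled name instead.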
llvm-svn: 218345
/machine:ebc was previously recognized but rejected. Unknown architecture
names were handled differently but eventually rejected too. We don't need
to distinguish them.
llvm-svn: 218344
This patch changes the type of export table set from std::set to
std::vector. The new code is slightly inefficient, but because
export table elements are actually mutable, std::vector is better
here. No functionality change.
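
A minimal sketch of the trade-off, using a hypothetical ExportEntry type
rather than the real descriptor: elements of a std::set are const, so
they cannot be updated in place, while a std::vector allows mutation at
the cost of a linear search.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical export entry; the field names are illustrative only.
struct ExportEntry {
  std::string name;
  int ordinal;
};

// Linear lookup that stands in for std::set::find. Returns nullptr if
// no entry with that name exists. The returned pointer is mutable.
ExportEntry *findExport(std::vector<ExportEntry> &exports,
                        const std::string &name) {
  auto it = std::find_if(
      exports.begin(), exports.end(),
      [&](const ExportEntry &e) { return e.name == name; });
  return it == exports.end() ? nullptr : &*it;
}
```

A caller can then write findExport(exports, "foo")->ordinal = 5 after a
null check, which a std::set element would not permit.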
llvm-svn: 218343
If two or more /export options are given for the same symbol, we should
always print a warning message and use the first one regardless of other
parameters.
Previously there was a case where the first parameter was not used.
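
The intended rule can be sketched as follows. ExportOption and
dedupExports are hypothetical names, and the warning text is made up for
illustration; only the policy (warn, keep the first) comes from this
message.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical parsed /export option.
struct ExportOption {
  std::string name;
  int ordinal;
};

// Keeps the first option seen for each symbol. Every later duplicate is
// dropped and recorded in `warnings`, regardless of its other fields.
std::vector<ExportOption>
dedupExports(const std::vector<ExportOption> &options,
             std::vector<std::string> &warnings) {
  std::vector<ExportOption> result;
  for (const ExportOption &opt : options) {
    auto dup = std::find_if(
        result.begin(), result.end(),
        [&](const ExportOption &e) { return e.name == opt.name; });
    if (dup != result.end()) {
      warnings.push_back("duplicate /export option: " + opt.name);
      continue; // the first occurrence wins
    }
    result.push_back(opt);
  }
  return result;
}
```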
llvm-svn: 218342
This testcase was not testing what it meant: because there were only two checks for
dmb {{ish}} in the second function, it could have missed a bug where one of the three
required dmb {{ish}} became dmb {{ishst}}. As I was fixing it, I also added
CHECK-LABELs to make it a bit less brittle.
llvm-svn: 218341
Usually, overriding a virtual function defined in a virtual base
requires emission of a vtordisp slot in the record. However, no vtordisp
is needed if the overriding function is pure; it should be impossible to
observe the pure virtual method.
This fixes PR21046.
llvm-svn: 218340
lists. Since the fields are initialized one at a time, using a field with
lower index to initialize a higher indexed field should not be warned on.
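
A small sketch of the pattern, assuming the truncated sentence above
refers to braced initializer lists for aggregates. The Pair type and the
function name are hypothetical.

```cpp
#include <cassert>

// Hypothetical aggregate with two fields.
struct Pair {
  int first;
  int second;
};

int secondFromFirst() {
  // The initializer for `second` reads `first`, which has the lower
  // index and has already been initialized at that point.
  Pair p = {1, p.first + 1};
  return p.second;
}
```

The reverse direction (initializing first from p.second) would read a
field that has not been initialized yet and should still be diagnosed.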
llvm-svn: 218339
shuffles using the AVX2 instructions. This is the first step of cutting
in real AVX2 support.
Note that I have spotted at least one bug in the test cases already, but
I suspect it was already present and is just getting surfaced. Will
investigate next.
llvm-svn: 218338
Rather than slurping in and splatting out the whole ctor list, preserve
the existing array entries without trying to understand them. Only
remove the entries that we know we can optimize away. This way we don't
need to wire through priority and comdats or anything else we might add.
Fixes a linker issue where the .init_array or .ctors entry would point
to discarded initialization code if the comdat group from the TU with
the faulty global_ctors entry was dropped.
llvm-svn: 218337
e.g., add w1, w2, w3, lsl #(2 - 1)
This sort of thing comes up in pre-processed assembly playing macro games.
Still validate that it's an assembly time constant. The early exit error check
was just a bit overzealous and disallowed a left paren.
rdar://18430542
llvm-svn: 218336
add VPBLENDD to the InstPrinter's comment generation so we get nice
comments everywhere.
Now that we have the nice comments, I can see the bug introduced by
a silly typo in the commit that enabled VPBLENDD, and have fixed it. Yay
tests that are easy to inspect.
llvm-svn: 218335
There are new register classes VCSrc_* which represent operands that
can take an SGPR, VGPR or inline constant. The VSrc_* class is now used
to represent operands that can take an SGPR, VGPR, or a 32-bit
immediate.
This allows us to have more accurate checks for legality of
immediates, since before we had no way to distinguish between operands
that supported any 32-bit immediate and operands which could only
support inline constants.
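
A rough sketch of the distinction this enables. The enum and function
names are hypothetical, and the set of inline constants used here
(integers -16 through 64 plus +/-0.5, 1.0, 2.0 and 4.0) is an assumption
about the hardware encoding that this message does not spell out.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical operand kinds named after the register classes above.
enum class OperandClass { VCSrc, VSrc };

// Assumed inline-constant set: small integers and a few float values,
// compared by 32-bit pattern.
bool isInlineConstant(std::uint32_t bits) {
  const std::int32_t s = static_cast<std::int32_t>(bits);
  if (s >= -16 && s <= 64)
    return true;
  switch (bits) {
  case 0x3f000000u: // 0.5
  case 0xbf000000u: // -0.5
  case 0x3f800000u: // 1.0
  case 0xbf800000u: // -1.0
  case 0x40000000u: // 2.0
  case 0xc0000000u: // -2.0
  case 0x40800000u: // 4.0
  case 0xc0800000u: // -4.0
    return true;
  default:
    return false;
  }
}

// VSrc operands accept any 32-bit immediate; VCSrc operands accept only
// inline constants.
bool isLegalImmediate(OperandClass cls, std::uint32_t bits) {
  return cls == OperandClass::VSrc || isInlineConstant(bits);
}
```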
llvm-svn: 218334
lexer, add the token buffer underneath the caching lexer where possible and
push the tokens directly into the caching lexer otherwise. We previously
put the lexer into a corrupted state where we could not guarantee to provide
the tokens in the right order and would sometimes assert.
llvm-svn: 218333
Summary:
AtomicExpand already had logic for expanding wide loads and stores on LL/SC
architectures, and for expanding wide stores on CmpXchg architectures, but
not for wide loads on CmpXchg architectures. This patch fills this hole,
and makes use of this new feature in the X86 backend.
Only one functional change: we now lose the SynchScope attribute.
It is regrettable, but I have another patch that I will submit soon that will
solve this for all of AtomicExpand (it seemed better to split it apart as it
is a different concern).
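
For illustration only, here is the general trick at the source level: a
wide atomic load can be expressed with nothing but compare-exchange. This
is a sketch using std::atomic, not the IR that AtomicExpand emits, and
loadViaCmpXchg is a hypothetical name.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Reads `obj` atomically using only compare-exchange. The exchange
// tries to replace 0 with 0: if the value is 0 it succeeds without any
// visible change, otherwise it fails and copies the current value into
// `expected`. Either way `expected` ends up holding the value.
std::uint64_t loadViaCmpXchg(std::atomic<std::uint64_t> &obj) {
  std::uint64_t expected = 0;
  obj.compare_exchange_strong(expected, 0, std::memory_order_seq_cst);
  return expected;
}
```

One cost of this approach is that the "load" may perform a write, so it
needs writable memory.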
Test Plan: make check-all (lots of tests for this functionality already exist)
Reviewers: jfb
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D5404
llvm-svn: 218332
Summary:
This patch makes use of AtomicExpandPass in Power for inserting fences around
atomic as part of an effort to remove fence insertion from SelectionDAGBuilder.
As a big bonus, it lets us use sync 1 (lightweight sync, often used by the mnemonic
lwsync) instead of sync 0 (heavyweight sync) in many cases.
I also added a test, as there was no test for the barriers emitted by the Power
backend for atomic loads and stores.
Test Plan: new test + make check-all
Reviewers: jfb
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D5180
llvm-svn: 218331
that function, and apart from being slow, this is unnecessary: ADL can trigger
instantiations that are not permitted here. The standard isn't *completely*
clear here, but this seems like the intent, and in any case this approach is
permitted by [temp.inst]p7.
llvm-svn: 218330
Summary:
The goal is to eventually remove all the code related to getInsertFencesForAtomic
in SelectionDAGBuilder as it is wrong (designed for ARM, not really portable, works
mostly by accident because the backends are overly conservative), and repeats the
same logic that goes in emitLeading/TrailingFence.
In this patch, I make AtomicExpandPass insert the fences as it knows better
where to put them. Because this requires getting the fences and not just
passing an IRBuilder around, I had to change the return type of
emitLeading/TrailingFence.
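
As a source-level analogy (not the IR the pass produces), the leading
and trailing fence placement looks roughly like this with std::atomic.
The function names are hypothetical.

```cpp
#include <atomic>
#include <cassert>

// Acquire load written as a relaxed access plus a trailing fence.
int loadWithTrailingFence(const std::atomic<int> &a) {
  int v = a.load(std::memory_order_relaxed);
  std::atomic_thread_fence(std::memory_order_acquire); // trailing fence
  return v;
}

// Release store written as a leading fence plus a relaxed access.
void storeWithLeadingFence(std::atomic<int> &a, int v) {
  std::atomic_thread_fence(std::memory_order_release); // leading fence
  a.store(v, std::memory_order_relaxed);
}
```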
This code only triggers on ARM for now. Because it is earlier in the pipeline
than SelectionDAGBuilder, it triggers and lowers atomic accesses to atomic so
SelectionDAGBuilder does not add barriers anymore on ARM.
If this patch is accepted I plan to implement emitLeading/TrailingFence for all
backends that setInsertFencesForAtomic(true), which will allow both making them
less conservative and simplifying SelectionDAGBuilder once they are all using
this interface.
This should not cause any functional change so the existing tests are used
and not modified.
Test Plan: make check-all, benefits from existing tests of atomics on ARM
Reviewers: jfb, t.p.northover
Subscribers: aemerson, llvm-commits
Differential Revision: http://reviews.llvm.org/D5179
llvm-svn: 218329
VPBLENDD where appropriate even on 128-bit vectors.
According to Agner's tables, this instruction is significantly higher
throughput (can execute on any port) on Haswell chips so we should
aggressively try to form it when available.
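
For reference, a scalar model of what a dword blend computes on a
128-bit vector: bit i of the immediate selects element i from the second
source. blendDwords is a hypothetical helper for illustration, not
backend code.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Scalar model of a 4 x 32-bit blend. If bit i of `imm` is set, element
// i comes from `b`; otherwise it comes from `a`.
std::array<std::uint32_t, 4>
blendDwords(const std::array<std::uint32_t, 4> &a,
            const std::array<std::uint32_t, 4> &b, unsigned imm) {
  std::array<std::uint32_t, 4> result{};
  for (unsigned i = 0; i < 4; ++i)
    result[i] = ((imm >> i) & 1u) ? b[i] : a[i];
  return result;
}
```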
Sadly, this loses our delightful shuffle comments. I'll add those back
for VPBLENDD next.
llvm-svn: 218322
This patch removes the old JIT memory manager (which does not provide any
useful functionality now that the old JIT is gone), and migrates the few
remaining clients over to SectionMemoryManager.
http://llvm.org/PR20848
llvm-svn: 218316
This script supports displaying developer-focused backtraces when working
with mixed Java and C/C++ stack frames within lldb. On Android, this represents
just about every app, since all apps start in Java code.
The script currently supports the Art JVM when run on host-side x86_64 and x86,
but does require a patch not yet accepted in AOSP:
AOSP patch: https://android-review.googlesource.com/#/c/106523/
The backtraces will hide Art VM machinery for interpreted and AOT code
and display the Java file/line numbers for Java code, while displaying
native backtrace info for native frames. Effectively the developer will
get an app-centric view of the call stack.
This script is not yet tested on device-side Art nor is it tested on
any architecture other than x86_64 or x86 32-bit. Several changes were
needed on the AOSP side to enable it to work properly for x86_64 and x86,
so it is quite likely we'll need to do something similar for other cpu
architectures as well.
Change by Tong Shen
llvm-svn: 218315
On further investigation, COMDATs should work with .ctors, and the issue
I was hitting probably reproduces with .init_array.
This reverts commit r218287.
llvm-svn: 218313
Summary:
I changed the build so that each ABI header gets its own install rule. This gives us the flexibility to install different headers in different directories.
This also fixes the problem where libstdc++ bits/<header>'s were not being installed under a bits directory.
Test Plan: I tested this patch on linux against libstdc++ and libcxxabi.
Reviewers: danalbert, mclow.lists, jroelofs
Reviewed By: jroelofs
Subscribers: jhunold, cfe-commits
Differential Revision: http://reviews.llvm.org/D5454
llvm-svn: 218309
The function deleteBody() converts the linkage to external and thus destroys
original linkage type value. Lack of correct linkage type causes wrong
relocations to be emitted later.
Calling dropAllReferences() instead of deleteBody() will fix the issue.
Differential Revision: http://reviews.llvm.org/D5415
llvm-svn: 218302
undef in the shuffle mask. This shows up when we're printing comments
during lowering and we still have an IR-level constant hanging around
that models undef.
A nice consequence of this is *much* prettier test cases where the undef
lanes actually show up as undef rather than as a particular set of
values. This also allows us to print shuffle comments in cases that use
undef such as the recently added variable VPERMILPS lowering. Now those
test cases have nice shuffle comments attached with their details.
The shuffle lowering for PSHUFB has been augmented to use undef, and the
shuffle combining has been augmented to comprehend it.
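
A tiny sketch of the printing side, assuming undef is modeled as a
negative sentinel and rendered as "u". The function name and the exact
bracket format are hypothetical, not the InstPrinter's code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical sentinel for an undef mask element.
constexpr int kUndefElt = -1;

// Formats a shuffle mask for a comment, printing undef lanes as "u"
// rather than as an arbitrary index.
std::string formatShuffleMask(const std::vector<int> &mask) {
  std::string out = "[";
  for (std::size_t i = 0; i < mask.size(); ++i) {
    if (i != 0)
      out += ",";
    out += mask[i] == kUndefElt ? std::string("u")
                                : std::to_string(mask[i]);
  }
  out += "]";
  return out;
}
```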
llvm-svn: 218301
trick that I missed.
VPERMILPS has a non-immediate memory operand mode that allows it to do
asymmetric shuffles in the two 128-bit lanes. Use this rather than two
shuffles and a blend.
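
A scalar model of the variable form on a 256-bit vector, for
illustration: each result element picks one of the four elements in its
own 128-bit lane, using the low two bits of the matching control
element. permuteInLane is a hypothetical name.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Scalar model of a variable in-lane permute of 8 floats. Element i of
// the result is taken from the same 128-bit lane of `src`, at the
// offset given by the low two bits of ctrl[i].
std::array<float, 8> permuteInLane(const std::array<float, 8> &src,
                                   const std::array<std::uint32_t, 8> &ctrl) {
  std::array<float, 8> result{};
  for (std::size_t i = 0; i < 8; ++i) {
    const std::size_t laneBase = (i / 4) * 4;
    result[i] = src[laneBase + (ctrl[i] & 3u)];
  }
  return result;
}
```

Because the control vector has one entry per element, the two lanes can
use different patterns, which the immediate form cannot express.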
However, it turns out the variable shuffle path to VPERMILPS (and
VPERMILPD, although that one offers no functional difference from the
immediate operand other than variability) wasn't even plumbed through
codegen. Do such plumbing so that we can reasonably emit
a variable-masked VPERMILP instruction. Also plumb basic comment parsing
and printing through so that the tests are reasonable.
There are still a few tests which don't show the shuffle pattern. These
are tests with undef lanes. I'll teach the shuffle decoding and printing
to handle undef mask entries in a follow-up. I've looked at the masks
and they seem reasonable.
llvm-svn: 218300