This patch attempts to fold the shuffling of 'scalar source' inputs - BUILD_VECTOR and SCALAR_TO_VECTOR nodes - if the shuffle node is the only user. This folds away a lot of unnecessary shuffle nodes, and allows quite a bit of constant folding that was being missed.
Differential Revision: http://reviews.llvm.org/D8516
llvm-svn: 234004
Fixes PR19582.
Previously, when an asm assignment (.set or =) was created, we would look up
the section immediately in MCSymbol::setVariableValue. This caused symbols
to receive the wrong section if the RHS of the assignment had not been seen
yet. This had a knock-on effect in the object file emitters, causing them
to emit extra symbols, or to give symbols the wrong visibility or the wrong
section. For example, in the following asm:
.data
.Llocal:
.text
leaq .Llocal1(%rip), %rdi
.Llocal1 = .Llocal2
.Llocal2 = .Llocal
the first assignment would give .Llocal1 a null section, which would never get
fixed up by the second assignment. This would cause the ELF object file emitter
to consider .Llocal1 to be an undefined symbol and give it external linkage,
even though .Llocal1 should not have been emitted at all in the object file.
Or in the following asm:
alias_to_local = Ltmp0
Ltmp0:
the Mach-O object file emitter would give the alias_to_local symbol a n_type
of N_SECT and a n_sect of 0. This is invalid under the Mach-O specification,
which requires N_SECT symbols to receive a non-zero section number if the
symbol is defined in a section in the object file.
https://developer.apple.com/library/mac/documentation/DeveloperTools/Conceptual/MachORuntime/#//apple_ref/c/tag/nlist
After this change we do not look up the section when the assignment is created,
but instead look it up on demand and store it in Section, which is treated
as a cache if the symbol is a variable symbol.
This change also fixes a bug in MCExpr::FindAssociatedSection. Previously,
if we saw a subtraction, we would return the first referenced section, even in
cases where we should have been returning the absolute pseudo-section. Now we
always return the absolute pseudo-section for expressions that subtract two
section-derived expressions. This isn't always correct (e.g. if one of the
sections ends up being laid out at an absolute address), but it's probably
the best we can do without more context.
This allows us to remove code in two places where we appear to have been
working around this bug, in MachObjectWriter::markAbsoluteVariableSymbols
and in X86AsmPrinter::EmitStartOfAsmFile.
Re-applies r233595 (aka D8586), which was reverted in r233898.
Differential Revision: http://reviews.llvm.org/D8798
llvm-svn: 233995
Summary:
Same as the last patch, but for BasicBlock
(Requires same code movement)
Reviewers: chandlerc
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D8801
llvm-svn: 233992
Summary:
Updated test to reflect that Linux and Darwin behave the same now.
Removed @expectedFailureLinux for passing tests.
Test Plan: run tests
Reviewers: clayborg, sivachandra
Subscribers: lldb-commits
Differential Revision: http://reviews.llvm.org/D8678
llvm-svn: 233989
DWARF types and types from modules can coexist even
if they have the same name and refer to two different
things.
<rdar://problem/18805055>
llvm-svn: 233988
Register coalescing can change the target of a RegPair hint to a
physreg, we should not crash on this. This also slightly improved the
way ARMBaseRegisterInfo::updateRegAllocHint() works.
llvm-svn: 233987
Summary:
Currently there are bugs in out detection of multi-level pointer conversions and pointer to member conversions. This patch fixes the following issues.
* Allow multi-level pointers with different nested qualifiers.
* Allow multi-level mixed pointers to objects and pointers to members with different nested qualifiers.
* Allow conversions from `int Base::*` to `int Derived::*` but only for non-nested pointers.
There is still some work that needs to be done to clean this patch up but I want to get some input on it.
Open questions:
* Does `__pointer_to_member_type_info::can_catch(...)` need to adjust the pointer if a base to derived conversion is performed?
Reviewers: danalbert, compnerd, mclow.lists
Reviewed By: mclow.lists
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D8758
llvm-svn: 233984
This prevents us from running out of registers in the backend.
Introducing stack malloc calls prevents the backend from recognizing the
inline asm operands as stack objects. When the backend recognizes a
stack object, it doesn't need to materialize the address of the memory
in a physical register. Instead it generates a simple SP-based memory
operand. Introducing a stack malloc forces the backend to find a free
register for every memory operand. 32-bit x86 simply doesn't have enough
registers for this to succeed in most cases.
Reviewers: kcc, samsonov
Differential Revision: http://reviews.llvm.org/D8790
llvm-svn: 233979
Summary:
The old requirement on GEP candidates being in bounds is unnecessary.
For off-bound GEPs, we still have
&B[i * S] = B + (i * S) * e = B + (i * e) * S
Test Plan: slsr_offbound_gep in slsr-gep.ll
Reviewers: meheff
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D8809
llvm-svn: 233949
result_type is always ErrorOr<unique_ptr<File>>, and since the type traits
is for instantiating ELF files, it's unlikely that we want to return
something else. This patch removes that type.
llvm-svn: 233948
This makes it possible to use the same representation of llvm.eh.actions
in outlined handlers as we use in the parent function because i32's are
just constants that can be copied freely between functions.
I had to add a sentinel alloca to the list of child allocas so that we
don't try to sink the catch object into the handler. Normally, one would
use nullptr for this kind of thing, but TinyPtrVector doesn't support
null elements. More than that, it's elements have to have a suitable
alignment. Therefore, I settled on this for my sentinel:
AllocaInst *getCatchObjectSentinel() {
return static_cast<AllocaInst *>(nullptr) + 1;
}
llvm-svn: 233947
The test class 'G' reads and writes to the same static variables in its
constructor, destructor and call operator. When threads are
constructed using `std::thread t((G()))` there is a race condition between the
destruction of the temporary and the execution of `G::operator()()`.
The fix is to simply create the input before creating the thread.
llvm-svn: 233946
Summary:
The summary of the bug, provided by Stephan T. Lavavej:
In shared_timed_mutex::try_lock_until() (line 195 in 3.6.0), you need to deliver a notification. The scenario is:
* There are N threads holding the shared lock.
* One thread calls try_lock_until() to attempt to acquire the exclusive lock. It sets the "I want to write" bool/bit, then waits for the N readers to drain away.
* K more threads attempt to acquire the shared lock, but they notice that someone said "I want to write", so they block on a condition_variable.
* At least one of the N readers is stubborn and doesn't release the shared lock.
* The wannabe-writer times out, gives up, and unsets the "I want to write" bool/bit.
At this point, a notification (it needs to be notify_all) must be delivered to the condition_variable that the K wannabe-readers are waiting on. Otherwise, they can block forever without waking up.
Reviewers: mclow.lists, jyasskin
Reviewed By: jyasskin
Subscribers: jyasskin, cfe-commits
Differential Revision: http://reviews.llvm.org/D8796
llvm-svn: 233944
There were a couple of real bugs here regarding error checking and
signed/unsigned comparisons, but mostly these were just noise.
There was one class of bugs fixed here which is particularly
annoying, dealing with MSVC's non-standard behavior regarding
the underlying type of enums. See the comment in
lldb-enumerations.h for details. In short, from now on please use
FLAGS_ENUM and FLAGS_ANONYMOUS_ENUM when defining enums which
contain values larger than can fit into a signed integer.
llvm-svn: 233943
Without this patch, we split the 256-bit vector into halves and produced something like:
movzwl (%rdi), %eax
vmovd %eax, %xmm0
vxorps %xmm1, %xmm1, %xmm1
vblendps $15, %ymm0, %ymm1, %ymm0 ## ymm0 = ymm0[0,1,2,3],ymm1[4,5,6,7]
Now, we eliminate the xor and blend because those zeros are free with the vmovd:
movzwl (%rdi), %eax
vmovd %eax, %xmm0
This should be the final fix needed to resolve PR22685:
https://llvm.org/bugs/show_bug.cgi?id=22685
llvm-svn: 233941
Require the pointee type to be passed explicitly and assert that it is
correct. For now it's possible to pass nullptr here (and I've done so in
a few places in this patch) but eventually that will be disallowed once
all clients have been updated or removed. It'll be a long road to get
all the way there... but if you have the cahnce to update your callers
to pass the type explicitly without depending on a pointer's element
type, that would be a good thing to do soon and a necessary thing to do
eventually.
llvm-svn: 233938
Now the GEP constant utility functions require the type to be explicitly
passed (since eventually the pointer type will be opaque and not convey
the required type information). For now callers can still pass nullptr
(though none were needed here in Clang, which is nice) if
convenienc/necessary, but eventually that will be disallowed as well.
llvm-svn: 233937
Guard against this by setting a new "m_finalizing" flag that lets us know we are in the process of finalizing.
<rdar://problem/20369152>
llvm-svn: 233935
There was a lot of code that was checking "if self.getPlatform() == 'darwin'" which is not correct. I fixed this by adding a:
lldbtest.platformIsDarwin()
which returns true if the current platform's OS is "macosx", "ios" or "darwin". These three valid darwin are now returned by a static function:
lldbtest.getDarwinOSTriples()
Fixed up all places that has 'if self.getPlatform() == "darwin":' with "if self.platformIsDarwin()" and all instances of 'if self.getPlatform() != "darwin":' with "if not self.platformIsDarwin()". I also fixed some darwin decorator functions to do the right thing as well.
llvm-svn: 233933
For code like this:
define <8 x i32> @load_v8i32() {
ret <8 x i32> <i32 7, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0>
}
We produce this AVX code:
_load_v8i32: ## @load_v8i32
movl $7, %eax
vmovd %eax, %xmm0
vxorps %ymm1, %ymm1, %ymm1
vblendps $1, %ymm0, %ymm1, %ymm0 ## ymm0 = ymm0[0],ymm1[1,2,3,4,5,6,7]
retq
There are at least 2 bugs in play here:
We're generating a blend when a move scalar does the same job using 2 less instruction bytes (see FIXMEs).
We're not matching an existing pattern that would eliminate the xor and blend entirely. The zero bytes are free with vmovd.
The 2nd fix involves an adjustment of "AddedComplexity" [1] and mostly masks the 1st problem.
[1] AddedComplexity has close to no documentation in the source.
The best we have is this comment: "roughly corresponds to the number of nodes that are covered".
It appears that x86 has bastardized this definition by inflating its values for some other
undocumented reason. For example, we have a pattern with "AddedComplexity = 400" (!).
I searched my way to this page:
https://groups.google.com/forum/#!topic/llvm-dev/5UX-Og9M0xQ
Differential Revision: http://reviews.llvm.org/D8794
llvm-svn: 233931
MSVC 2013 requires the argument to __declspec(align()) to be an integer
constant expression that doesn't involve any identifiers like sizeof.
For GCC and Clang, LLVM_PTR_SIZE is equivalent to __SIZEOF_POINTER__,
which dates back to GCC 4.6 and Clang 2010. If that's not available, we
get sizeof(void*), which works with alignas() and
__attribute__((aligned())).
For MSVC, LLVM_PTR_SIZE is 4 or 8 depending on _WIN64.
llvm-svn: 233929