extended loads.
Implement the related target lowering hook so that the optimization has a better
estimation of the cost of an extension.
rdar://problem/19267165
llvm-svn: 233753
Uniqued nodes have more complete registration with
`ReplaceableMetadataImpl` so that they can update themselves when
operands change. Fix a bug where `MDNode::replaceWithUniqued()` wasn't
enabling these callbacks.
The two most obvious ways missing callbacks causes problems is that
auto-resolution fails and re-uniquing (on changed operands) just doesn't
happen. I've added tests for both -- in both cases, I confirmed that
the final check was failing before the fix.
rdar://problem/20365935
llvm-svn: 233751
The existing code in getMemsetValue only handled integer-preferred types when
the fill value was not a constant. Make this more robust in two ways:
1. If the preferred type is a floating-point value, do the mul-splat trick on
the corresponding integer type and then bitcast.
2. If the preferred type is a vector, do the mul-splat trick on one vector
element, and then build a vector out of them.
Fixes PR22754 (although, we should also turn off use of vector types at -O0).
llvm-svn: 233749
This patch fixes MCJIT::addGlobalMapping by changing the implementation of the
ExecutionEngineState class. The new implementation maintains a bidirectional
mapping between symbol names (std::strings) and addresses (uint64_ts), rather
than a mapping between Value*s and void*s.
This has fix has been made for backwards compatibility, however the strongly
preferred way to resolve unknown symbols is by writing a custom
RuntimeDyld::SymbolResolver (formerly RTDyldMemoryManager) and overriding the
findSymbol method. The addGlobalMapping method is a hangover from the legacy JIT
(which has was removed in 3.6), and may be deprecated in a future release as
part of a clean-up of the ExecutionEngine interface.
Patch by Murat Bolat. Thanks Murat!
llvm-svn: 233747
Specify an allocation order with a register class. This is used by register
allocators with a greedy heuristic. This is usefull as it is sometimes
beneficial to color more constrained classes first.
Differential Revision: http://reviews.llvm.org/D8626
llvm-svn: 233743
When allocating live intervals in linear order and all of them are local
to a single basic block you get an optimal coloring. This is also true
if you reverse the order, but it is not true if you sort live ranges
beginnings in reverse order, change to sort live range endings in
reverse order. Take the following live ranges for example:
|---| |--------|
|----------| |-------|
They get colored suboptimally with 3 registers if you sort the live range
starting points in reverse order (but optimally with live range begins in order,
or live range ends in reverse order).
Apparently the previous strategy was intentional because of allocation
time considerations. I am having a hard time replicating these effects,
while I see substantial improvements in allocation quality with this
change.
No testcase as none of the (in tree) targets use reverse order mode.
Differential Revision: http://reviews.llvm.org/D8625
llvm-svn: 233742
Change lowerCTPOP to:
- Gracefully handle a known-zero input value
- Simplify computation of significant bit size
Thanks to Jay Foad for the review!
llvm-svn: 233736
These regression tests are supposed to test small code model support, but have
been XFAIL'd because we don't have an in-tree memory manager that can guarantee
a small-code-model compatible memory layout. Unfortunately, they can
occasionally pass if they get lucky with memory allocation, causing unexpected
passes on the bots. That's not very helpful.
I'm going to remove these until we have the infrastructure (small-code-model
compatible memory manager) to run them properly.
llvm-svn: 233722
Multiple inheritance is casually used here. Rewriting to not
using multiple inheritance reduces the complexity of the code
and also makes it shorter.
llvm-svn: 233718
Removed expectedFailureLinux from failures that I was unable to
reproduce, updated and improved some other comments near XFAIL tests
Differential Revision: http://reviews.llvm.org/D8676
llvm-svn: 233716
Summary:
In certain cases vector can use memcpy to construct a range of elements at the back of the vector. We currently don't do this resulting in terrible code gen in non-optimized mode and a
very large slowdown compared to libstdc++.
This patch adds a `__construct_forward_range(Allocator, Iter, Iter, _Ptr&)` and `__construct_forward_range(Allocator, Tp*, Tp*, Tp*&)` functions to `allocator_traits` which act similarly to the existing `__construct_forward(...)` functions.
This patch also changes vectors `__construct_at_end(Iter, Iter)` to be `__construct_at_end(Iter, Iter, SizeType)` where SizeType is the size of the range. `__construct_at_end(Iter, Iter, SizeType)` now calls `allocator_traits<Tp>::__construct_forward_range(...)`.
This patch is based off the design of `__swap_out_circular_buffer(...)` which uses `allocator_traits<Tp>::__construct_forward(...)`.
On my machine this code performs 4x better than the current implementation when tested against `std::vector<int>`.
Reviewers: howard.hinnant, titus, kcc, mclow.lists
Reviewed By: mclow.lists
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D8109
llvm-svn: 233711
On FreeBSD LLDB's triple ends up as e.g. "x86_64-unknown-freebsd10.1"
but getPlatform() consumers expect just the name with no version
number.
llvm-svn: 233705
I suggested this change in D7898 (http://llvm.org/viewvc/llvm-project?view=revision&revision=231354)
It improves the v4i64 case although not optimally. This AVX codegen:
vmovq {{.*#+}} xmm0 = mem[0],zero
vxorpd %ymm1, %ymm1, %ymm1
vblendpd {{.*#+}} ymm0 = ymm0[0],ymm1[1,2,3]
Becomes:
vmovsd {{.*#+}} xmm0 = mem[0],zero
Unfortunately, this doesn't completely solve PR22685. There are still at least 2 problems under here:
We're not handling v32i8 / v16i16.
We're not getting the FP / int domains right for instruction selection.
But since this patch alone appears to do no harm, reduces code duplication, and helps v4i64,
I'm submitting this patch ahead of fixing the above.
Differential Revision: http://reviews.llvm.org/D8341
llvm-svn: 233704
A temp object was being created to call StripOffFileName. This function
is not dependent on class members so I am making it static.
No regression on testsuite on Linux.
llvm-svn: 233703
Use "constructors that are callable with a single argument" instead of
"single-argument constructors" when referring to constructors using default
arguments or parameter packs.
llvm-svn: 233702
Summary:
Force braces on the else branch if they are being added to the if branch.
This ensures consistency in the transformed code.
Reviewers: alexfh
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D8708
llvm-svn: 233697