The getBinary and getBuffer methods now return ordinary pointers of appropriate
const-ness. Ownership is transferred by calling takeBinary(), which returns a
pair of the Binary and a MemoryBuffer.
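A minimal sketch of the resulting ownership pattern (names and exact signatures
are illustrative, not the in-tree declarations):

#include <memory>
#include <utility>

struct MemoryBuffer { /* ... */ };
struct Binary { /* ... */ };

// Hypothetical owner modeled on the description above: the accessors hand out
// plain pointers, and ownership leaves only through takeBinary().
class OwningBinarySketch {
  std::unique_ptr<Binary> Bin;
  std::unique_ptr<MemoryBuffer> Buf;

public:
  OwningBinarySketch(std::unique_ptr<Binary> B, std::unique_ptr<MemoryBuffer> M)
      : Bin(std::move(B)), Buf(std::move(M)) {}

  // Ordinary pointers of the appropriate const-ness; no ownership transfer.
  Binary *getBinary() { return Bin.get(); }
  const Binary *getBinary() const { return Bin.get(); }
  MemoryBuffer *getBuffer() { return Buf.get(); }
  const MemoryBuffer *getBuffer() const { return Buf.get(); }

  // Transfers ownership of both objects at once.
  std::pair<std::unique_ptr<Binary>, std::unique_ptr<MemoryBuffer>>
  takeBinary() {
    return std::make_pair(std::move(Bin), std::move(Buf));
  }
};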
llvm-svn: 221003
If it has an Address object, it is assumed to be Valid.
Change SBAddress to always have an Address object and check
whether it is valid or not in those cases.
This fixes a subtle problem where an SBAddress with an Address
of LLDB_INVALID_ADDRESS could run through a copy constructor and
turn into an SBAddress with no backing Address object at all
(because the copy constructor wasn't distinguishing between an
invalid Address and no Address).
The cost of an Address object is not high and this will be
an easy mistake for someone else to make; I'm fixing
SBAddress so it doesn't come up again.
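A simplified illustration of the invariant (not the actual SBAddress
declaration; the member and type names are only stand-ins):

#include <cstdint>
#include <memory>

static const uint64_t kInvalidAddress = UINT64_MAX; // stand-in for LLDB_INVALID_ADDRESS

struct AddressSketch {
  uint64_t Offset = kInvalidAddress;
  bool IsValid() const { return Offset != kInvalidAddress; }
};

class SBAddressSketch {
  std::unique_ptr<AddressSketch> m_addr; // always non-null after the fix

public:
  SBAddressSketch() : m_addr(new AddressSketch()) {}
  // The copy constructor copies the Address, preserving "invalid" as a state
  // of the Address instead of collapsing it into "no Address at all".
  SBAddressSketch(const SBAddressSketch &rhs)
      : m_addr(new AddressSketch(*rhs.m_addr)) {}
  bool IsValid() const { return m_addr->IsValid(); }
};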
<rdar://problem/18069407>
llvm-svn: 221002
We need to figure out how to track ptrtoint values all the
way until the result is converted back to a pointer in order
to correctly rewrite the pointer type.
llvm-svn: 220997
Summary:
The Itanium ABI approach of using offset-to-top isn't possible with the
MS ABI, which doesn't have that kind of information lying around.
Instead, we do the following:
- Call the virtual deleting destructor with the "don't delete the object"
flag set. The virtual deleting destructor will return a pointer to
'this' adjusted to the most derived class.
- Call the global delete using the adjusted 'this' pointer.
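In rough source-level terms, the sequence above corresponds to the following
sketch (the helper name is hypothetical; the real call goes through the vftable
entry for the vector deleting destructor):

#include <new>

// Hypothetical stand-in for invoking the MS ABI virtual deleting destructor;
// the flag says whether the destructor should also free the object, and the
// return value is 'this' adjusted to the most derived class.
void *CallVirtualDeletingDtor(void *This, bool ShouldDelete);

void EmitGlobalDelete(void *Ptr) {
  // Destroy without freeing, and recover the most-derived 'this'.
  void *MostDerived = CallVirtualDeletingDtor(Ptr, /*ShouldDelete=*/false);
  // Free the complete object with the global operator delete.
  ::operator delete(MostDerived);
}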
Reviewers: rnk
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D5996
llvm-svn: 220993
Now that we have initial support for VSX, we can begin adding
intrinsics for programmer access to VSX instructions. This patch
performs the necessary enablement in the front end, and tests it by
implementing intrinsics for minimum and maximum using the vector
double data type.
The main change in the front end is to no longer disallow "vector" and
"double" in the same declaration (lib/Sema/DeclSpec.cpp), but "vector"
and "long double" must still be disallowed. The new intrinsics are
accessed via vec_max and vec_min with changes in
lib/Headers/altivec.h. Note that for v4f32, we already access
corresponding VMX builtins, but with VSX enabled we should use the
forms that allow all 64 vector registers.
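For example (illustrative only; requires a PowerPC target with VSX enabled,
e.g. -mvsx):

#include <altivec.h>

// With VSX, "vector double" is legal and vec_max/vec_min on it should map to
// the VSX double-precision maximum/minimum instructions.
vector double vmax(vector double a, vector double b) { return vec_max(a, b); }
vector double vmin(vector double a, vector double b) { return vec_min(a, b); }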
The new built-ins are defined in include/clang/Basic/BuiltinsPPC.def.
I've added a new test in test/CodeGen/builtins-ppc-vsx.c that is
similar to, but much smaller than, builtins-ppc-altivec.c. This
allows us to test VSX IR generation without duplicating CHECK lines
for the existing bazillion Altivec tests.
Since vector double is now legal when VSX is available, I've modified
the error message, and changed where we test for it and for vector
long double, since the target machine isn't visible in the old place.
This serendipitously removed an irrelevant warning about 'long' being
deprecated when used with 'vector' in the case where "vector long double" is
encountered and we just want to issue an error. The existing tests
test/Parser/altivec.c and test/Parser/cxx-altivec.cpp have been
updated accordingly, and I've added test/Parser/vsx.c to verify that
"vector double" is now legitimate with VSX enabled.
There is a companion patch for LLVM.
llvm-svn: 220989
Now that we have initial support for VSX, we can begin adding
intrinsics for programmer access to VSX instructions. This patch adds
basic support for VSX intrinsics in general, and tests it by
implementing intrinsics for minimum and maximum for the vector double
data type.
The LLVM portion of this is quite straightforward. There is a
companion patch for Clang.
llvm-svn: 220988
Our internal test reveals that such a case should not be transformed:
cmp x17, #3
b.lt .LBB10_15
...
subs x12, x12, #1
b.gt .LBB10_1
where x12 is a liveout, becomes:
cmp x17, #2
b.le .LBB10_15
...
subs x12, x12, #2
b.ge .LBB10_1
Unable to provide a test case as it's difficult to reproduce on the community branch.
http://reviews.llvm.org/D6048
Patch by Zhaoshi Zheng <zhaoshiz@codeaurora.org>!
llvm-svn: 220987
Summary:
When we are adding field paddings for asan, even an empty dtor has to remain in
the code, so we ignore -mconstructor-aliases if the paddings are going to be added.
Test Plan: added a test
Reviewers: rsmith, rnk, rafael
Reviewed By: rafael
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D6038
llvm-svn: 220986
the runtime rather than trying to fix it up,
because now those types have ivars regardless of
whether they come from "frame variable" or from
expressions.
Patch by Enrico Granata.
llvm-svn: 220982
would fail if the class had no ivars.
- Updated use of the RealizeType API by the class
descriptors to use "for_expression" rather than
the misnamed "allow_unknownanytype."
llvm-svn: 220980
to indicate that we're doing stuff for the expression
parser.
- When for_expression is true, look through @s and find
the actual class rather than just returning id.
- Rename BuildObjCObjectType to BuildObjCObjectPointerType
since it's actually returning an object *pointer* type.
llvm-svn: 220979
This patch adds an optimization in CodeGenPrepare to move an extractelement
right before a store when the target can combine them.
The optimization may promote scalar operations to vector operations along the
way to make that possible.
** Context **
Some targets use different register files for both vector and scalar operations.
This means that transitioning from one domain to another may incur a copy from
one register file to another. These copies are not coalescable and may be
expensive.
For example, according to the scheduling model, on cortex-A8 a vector to GPR
move is 20 cycles.
** Motivating Example **
Let us consider an example:
define void @foo(<2 x i32>* %addr1, i32* %dest) {
%in1 = load <2 x i32>* %addr1, align 8
%extract = extractelement <2 x i32> %in1, i32 1
%out = or i32 %extract, 1
store i32 %out, i32* %dest, align 4
ret void
}
As it is, this IR generates the following assembly on armv7:
vldr d16, [r0] @ vector load
vmov.32 r0, d16[1] @ cross-register-file copy: 20 cycles
orr r0, r0, #1 @ scalar bitwise or
str r0, [r1] @ scalar store
bx lr
Whereas we could generate much faster code:
vldr d16, [r0] @ vector load
vorr.i32 d16, #0x1 @ vector bitwise or
vst1.32 {d16[1]}, [r1:32] @ vector extract + store
bx lr
Half of the computation done in the vector is useless, but this allows us to
get rid of the expensive cross-register-file copy.
** Proposed Solution **
To avoid this cross-register-file copy penalty, we promote the scalar operations
to vector operations. The penalty is removed if we manage to promote the whole
chain of computation into the vector domain.
Currently, we do that only when the chain of computation ends with a store and
the target is able to combine an extract with a store.
Stores are the most likely candidates, because other instructions produce values
that would need to be promoted and thus extracted at some point [1]. Moreover,
it is customary for targets to feature stores that perform a vector extract (see
AArch64 and X86, for instance).
The proposed implementation relies on the TargetTransformInfo to decide whether
or not it is beneficial to promote a chain of computation in the vector domain.
Unfortunately, this interface is rather inaccurate at this level of detail, and
although this optimization may be beneficial for X86 and AArch64, the inaccuracy
will lead to the optimization being too aggressive.
Basically in TargetTransformInfo, everything that is legal has a cost of 1,
whereas, even if a vector type is legal, usually a vector operation is slightly
more expensive than its scalar counterpart. That will lead to too many
promotions that may not be counterbalanced by the savings from eliminating the
cross-register-file copy. For instance, on AArch64 this penalty is just 4
cycles.
For now, the optimization is only enabled for ARM prior to v8, since those
processors have a larger penalty on cross-register-file copies, and the scope is
limited to basic blocks. Because of these two factors, we limit the effects of
the inaccuracy. Indeed, I did not want to build up a fancy cost model with block
frequency and everything on top of that.
[1] We can imagine targets that can combine an extractelement with instructions
other than stores. If we want to go in that direction, the current interfaces
must be augmented and, moreover, I think this becomes a global isel
problem.
Differential Revision: http://reviews.llvm.org/D5921
<rdar://problem/14170854>
llvm-svn: 220978
Reuse the PPC64 HVA detection algorithm for ARM and AArch64. This is a
nice code deduplication, since they are roughly identical. A few virtual
method extension points are needed to understand how big an HVA can be
and what element types it can have for a given architecture.
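Roughly, the extension points have this shape (a sketch only; the real hooks
are virtual methods on clang's ABIInfo, and the names here are paraphrased from
the description above):

#include <cstdint>

struct Type; // stand-in for clang's QualType

class TargetHVASketch {
public:
  virtual ~TargetHVASketch() = default;

  // Can this element type (float, double, a vector, ...) be a member of a
  // homogeneous aggregate on this target?
  virtual bool isHomogeneousAggregateBaseType(const Type *Ty) const = 0;

  // Is an aggregate of Members copies of Base still small enough to be passed
  // as a homogeneous aggregate on this target?
  virtual bool isHomogeneousAggregateSmallEnough(const Type *Base,
                                                 uint64_t Members) const = 0;
};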
Also make the record expansion code work in the presence of non-virtual
bases.
Reviewed By: uweigand, asl
Differential Revision: http://reviews.llvm.org/D6045
llvm-svn: 220972
Rather than executing code that is only needed for an assertion even in a
non-asserts build, just roll the function into the assert. The assertion
text literally describes the two cases, so this doesn't seem to benefit
much from having a separate function (and having to hassle with #ifndef
NDEBUG-ing it out, etc.).
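A generic before/after illustration of the pattern (hypothetical names,
unrelated to the function actually touched by this commit):

#include <cassert>

struct Node {
  bool isLeaf() const { return true; }
  bool isRoot() const { return false; }
};

// Before: the helper is called unconditionally, even when asserts are compiled
// out and its result is unused.
static bool isLeafOrRoot(const Node &N) { return N.isLeaf() || N.isRoot(); }

void checkBefore(const Node &N) {
  bool OK = isLeafOrRoot(N);
  assert(OK && "node must be a leaf or the root");
  (void)OK;
}

// After: the condition lives inside the assert, so it disappears entirely
// under NDEBUG and the message spells out both cases directly.
void checkAfter(const Node &N) {
  assert((N.isLeaf() || N.isRoot()) && "node must be a leaf or the root");
}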
llvm-svn: 220970
Tested this by #if 0'ing out the pthreads implementation, which
showed that this fallback was not currently compiling successfully;
applying this patch resolves that.
Patch by Andy Chien.
llvm-svn: 220969
Summary:
Ed Maste found some problems with the commit in D5988. Address most of these.
While here, also add floating point return handling. This doesn't handle
128-bit long double yet. Since I don't have any system that uses it, I don't
currently have plans to implement it.
Reviewers: emaste
Reviewed By: emaste
Subscribers: emaste, lldb-commits
Differential Revision: http://reviews.llvm.org/D6049
llvm-svn: 220963
In a case where we have a no {un,}signed wrap flag on the increment, if
RHS - Start is constant then we can avoid inserting a max operation between
the two, since we can statically determine which is greater.
This allows us to unroll loops such as:
void testcase3(int v) {
for (int i=v; i<=v+1; ++i)
f(i);
}
llvm-svn: 220960
Since block address values can be larger than 2GB in 64-bit code, they
cannot be loaded simply using an @l / @ha pair, but instead must be
loaded from the TOC, just like GlobalAddress, ConstantPool, and
JumpTable values are.
The commit also fixes a bug in PPCLinuxAsmPrinter::doFinalization where
temporary labels could not be used as TOC values, since code would
attempt (and fail) to use GetOrCreateSymbol to create a symbol of the
same name as the temporary label.
llvm-svn: 220959
Since the JIT->MCJIT migration, most of the ExecutionEngine interface
became deprecated and/or broken. This especially affected the OCaml
bindings, as runFunction is no longer available, and unlike in C,
it is not possible to coerce a pointer to a function and call it
in OCaml.
In practice, LLVM 3.5 shipped a completely unusable
Llvm_executionengine.
The GenericValue interface and runFunction were essentially
a poor man's FFI. As such, this interface was removed, and instead
a dependency on ctypes >=0.3 was added; ctypes handles the
platform-specific aspects of accessing data and calling functions.
The new interface exposes neither the JIT (which is a shim around MCJIT)
nor the interpreter (which can't handle a lot of valid IR).
Llvm_executionengine.add_global_mapping is currently unusable
due to PR20656.
llvm-svn: 220957