Summary:
Different range metadata can lead to different optimizations in later
passes, possibly breaking the semantics of the merged function. So range
metadata must be taken into consideration when comparing Load
instructions.
Thanks!
llvm-svn: 211391
When RuntimeDyldELF creates stub functions, it needs to install
relocations that will resolve to the final address of the target
routine. Since those are 16-bit relocs, they need to be applied to the
least-significant halfword of the instruction. On big-endian ppc64,
this means that addresses have to be adjusted by 2, which is what the
code currently does.
However, on a little-endian system, the address must *not* be adjusted;
the least-significant halfword is the first one. This patch updates the
RuntimeDyldELF code to take the target byte order into account.
llvm-svn: 211384
This adds support for several missing PPC64 relocations in the
straight-forward manner to RuntimeDyldELF.cpp.
Note that this actually fixes a failure of a large-model test case on
PowerPC, allowing the XFAIL to be removed.
llvm-svn: 211382
Tests for both thread suffix and no thread suffix execution.
Moved some bit-flipping helper methods from TestLldbGdbServer
into the base GdbRemoteTestCaseBase class.
llvm-svn: 211381
Mixing of AddAvailableValue and GetValueAtEndOfBlock methods of SSAUpdater
leaded to the endless loop generation when the nested loops annotated.
This fixes a bug in the OCL_ML/KNN OpenCV test. The test case is too
complex for FileCheck and would be very fragile.
Patch by: Elena Denisova
llvm-svn: 211374
When small arguments (structures < 8 bytes or "float") are passed in a
stack slot in the ppc64 SVR4 ABI, they must reside in the least
significant part of that slot. On BE, this means that an offset needs
to be added to the stack address of the parameter, but on LE, the least
significant part of the slot has the same address as the slot itself.
For the most part, this is handled in the LLVM back-end, where I just
fixed the LE case in commit r211368.
However, there is one piece of the clang front-end that is also aware of
these stack-slot offsets: PPC64_SVR4_ABIInfo::EmitVAArg. This patch
updates that routine to take endianness into account.
llvm-svn: 211370
+ Collect reduction dependences
+ Introduced TYPE_RED in Dependences.h which can be used to obtain the
reduction dependences
+ Used TYPE_RED to prevent parallelization while we do not have a privatizing
code generation
+ Relax the dependences for non-parallel code generation
+ Add privatization dependences to ensure correctness
+ 12 Test cases to check for reduction and privatization dependences
llvm-svn: 211369
When small arguments (structures < 8 bytes or "float") are passed in a
stack slot in the ppc64 SVR4 ABI, they must reside in the least
significant part of that slot. On BE, this means that an offset needs
to be added to the stack address of the parameter, but on LE, the least
significant part of the slot has the same address as the slot itself.
This changes the PowerPC back-end ABI code to only add the small
argument stack slot offset for BE. It also adds test cases to verify
the correct behavior on both BE and LE.
llvm-svn: 211368
FIXME: This fails on win32 due to ERROR_FILENAME_EXCED_RANGE if the working directory is too deep.
We should make Win32/Path.inc capable of long pathnames with '\\?\'.
llvm-svn: 211363
and is unrelated to the NEON intrinsics in arm_neon.h. On little
endian machines it works fine, however on big endian machines it
exhibits surprising behaviour:
uint32x2_t x = {42, 64};
return vget_lane_u32(x, 0); // Will return 64.
Because of this, explicitly call out that it is unsupported on big
endian machines.
This patch will emit the following warning in big-endian mode:
test.c:3:15: warning: vector initializers are a GNU extension and are not compatible with NEON intrinsics [-Wgnu]
int32x4_t x = {0, 1, 2, 3};
^
test.c:3:15: note: consider using vld1q_s32() to initialize a vector from memory, or vcombine_s32(vcreate_s32(), vcreate_s32()) to initialize from integer constants
1 warning generated.
llvm-svn: 211362
On PowerPC LE the system uses the /lib64/ld64.so.2 dynamic linker name
instead of /lib64/ld64.so.1 (to indicate the ELFv2 ABI version).
This fixes the clang driver to pass the appropriate -dynamic-linker
setting, and adds some more tests to linux-ld.c.
llvm-svn: 211360
There was already partial support for multi-arch on powerpc64le,
but the MultiarchIncludeDirs setting was missing. This patch
adds the appropriate definition, and also extends the
linux-header-search.cpp test case to verify an Ubuntu 14.04
powerpc64le tree.
llvm-svn: 211359
Targets can assume that a target streamer is present, so they have to be able
to construct a null streamer in order to set the target streamer in it to.
Fixes a crash when using the null streamer with arm.
llvm-svn: 211358
Various places in LLVM assume that container size and count are unsigned
and do not use the container size_type. Therefore they break compilation
(or possibly executation) for LP64 systems where size_t is 64 bit while
unsigned is still 32 bit.
If we'll ever that many items in the container size_type could be made
size_t for a specific containers after reviweing its other uses.
llvm-svn: 211353
only 1/0 result like std::set. Some of the LLVM ADT already return unsigned
count(), while others still return bool count().
In continuation to r197879, this patch modifies DenseMap, DenseSet,
ScopedHashTable, ValueMap:: count() to return size_type instead of bool,
1 instead of true and 0 instead of false.
size_type is typedef-ed locally within each class to size_t.
http://reviews.llvm.org/D4018
Reviewed by dblaikie.
llvm-svn: 211350
When adding the implicit compound statement (required for Codegen?), the
end location was previously overridden by the start location, probably
based on the assumptions:
* The location of the compound statement should be the member's location
* The compound statement if present is the last element of a FunctionDecl
This patch changes the location of the compound statement to the
member's end location.
Code review: http://reviews.llvm.org/D4175
llvm-svn: 211344
This patch adds support to recognize patterns such as fadd,fsub,fadd,fsub.../add,sub,add,sub... and
vectorizes them as vector shuffles if they are profitable.
These patterns of vector shuffle can later be converted to instructions such as addsubpd etc on X86.
Thanks to Arnold and Hal for the reviews. http://reviews.llvm.org/D4015
llvm-svn: 211339