This change will ease the transision to multiple reductions per statement as
we can now distinguish the effects of multiple reductions in the same
statement.
+ Wrapped reduction dependences are used to compute privatization dependences
+ Modified test cases to account for the change
llvm-svn: 211795
This dependency analysis will keep track of memory accesses if they might be
part of a reduction. If not, the dependences are tracked on a statement level.
The main reason to do this is to reduce the compile time while beeing able to
distinguish the effects of reduction and non-reduction accesses.
+ Adjusted two test cases
llvm-svn: 211794
Get the predefined macro for the architecture correct.
cortex-m4: __ARM_ARCH_7EM__
cortex-m3: __ARM_ARCH_7M__
cortex-m0: __ARM_ARCH_6M__
rdar://17420090
llvm-svn: 211792
The default rounding mode to initialize the mode register needs
to be reported to the runtime. Fill in other bits a kernel
may be interested in setting for future use.
llvm-svn: 211791
Otherwise the test evaluates to true on OpenCL 1.1 and earlier. Since we
therefore cannot use the CL_VERSION_?_? macros move them to the proper
position in the top-level header.
llvm-svn: 211787
This commit implements the -fuse-ld= option, so that the user
can specify -fuse-ld=bfd to use ld.bfd.
This commit re-applies r194328 with some test case changes.
It seems that r194328 was breaking macosx or mingw build
because clang can't find ld.bfd or ld.gold in the given sysroot.
We should use -B to specify the executable search path instead.
Patch originally by David Chisnall.
llvm-svn: 211785
Use a container class to store the reject logs. Delegating most calls to
the internal std::map and add a few convenient shortcuts (e.g.,
hasErrors()).
llvm-svn: 211780
includes handling DIR_PWR8 where appropriate
The P7Model Itinerary is currently tied in for use under the P8Model, and will be updated later.
llvm-svn: 211779
Additional compliant GAS names for coprocessor register name
are enabled for all instruction with parameter MCK_CoprocReg:
LDC,LDC2,STC,STC2,CDP,CDP2,MCR,MCR2,MCRR,MCRR2,MRC,MRC2,MRRC,MRRC2
Patch by Andrey Kuharev.
llvm-svn: 211776
This patch teaches the backend how to canonicalize a shuffle vectors
according to the rule:
- (shuffle (FADD A, B), (FSUB A, B), Mask) ->
(shuffle (FSUB A, -B), (FADD A, -B), Mask)
Where 'Mask' is:
<0,5,2,7> ;; for v4f32 and v4f64 shuffles.
<0,3> ;; for v2f64 shuffles.
<0,9,2,11,4,13,6,15> ;; for v8f32 shuffles.
In general, ISel only knows how to pattern-match a canonical
'fadd + fsub + blendi' dag node sequence into an ADDSUB instruction.
This new rule allows to convert a non-canonical dag sequence into a
canonical one that will be matched by a single ADDSUB at ISel stage.
The idea of converting a non-canonical ADDSUB into a canonical one by
swapping the first two operands of the shuffle, and then negating the
second operand of the FADD and FSUB, was originally proposed by Hal Finkel.
llvm-svn: 211771
Add support for generating optimization remarks after completing the
detection of Scops.
The goal is to provide end-users with useful hints about opportunities that
help to increase the size of the detected Scops in their code.
By default the remark is unspecified and the debug location is empty. Future
patches have to expand on the messages generated.
This patch brings a simple test case for ReportFuncCall to demonstrate the
feature.
Reports all missed opportunities to increase the size/number of valid
Scops:
clang <...> -Rpass-missed="polly-detect" <...>
opt <...> -pass-remarks-missed="polly-detect" <...>
Reports beginning and end of all valid Scops:
clang <...> -Rpass="polly-detect" <...>
opt <...> -pass-remarks="polly-detect" <...>
Differential Revision: http://reviews.llvm.org/D4171
llvm-svn: 211769
Previously dllimport variables inside of template arguments relied on
not using the C++11 codepath when -fms-compatibility was set.
While this allowed us to achieve compatibility with MSVC, it did so at
the expense of MingW.
Instead, try to use the DeclRefExpr we dig out of the template argument.
If it has the dllimport attribute, accept it and skip the C++11
null-pointer check.
llvm-svn: 211766
This patch enables transforms for
(x + (~(y | c) + 1) --> x - (y | c) if c is even
Differential Revision: http://reviews.llvm.org/D4209
llvm-svn: 211765
Folding a reference to a thread_local variable into another global
variable's initializer is very problematic, there is no relocation that
exists to represent such an access.
llvm-svn: 211762
Improve the warning when building with -fprofile-instr-use and a file
appears not to have been profiled at all. This keys on whether a
function is defined in the main file or not to avoid false negatives
when one includes a header with functions that have been profiled.
llvm-svn: 211760
Summary:
The BSDs and Darwin all forward the whole 'u' group, but gcc only
forwards -u so far as I can tell. I only forward -u, since that's a
minimal change, and many people object to magically recognizing and
forwarding linker arguments.
Reviewers: chandlerc, joerg
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D4304
llvm-svn: 211756
The *_alt defs for vcmp are used by the InstParser (the asm string in the main
def is used by the InstPrinter) . The former was accepting vector registers
as destination rather than mask registers.
llvm-svn: 211750
string_ostream is a safe and efficient string builder that combines opaque
stack storage with a built-in ostream interface.
small_string_ostream<bytes> additionally permits an explicit stack storage size
other than the default 128 bytes to be provided. Beyond that, storage is
transferred to the heap.
This convenient class can be used in most places an
std::string+raw_string_ostream pair or SmallString<>+raw_svector_ostream pair
would previously have been used, in order to guarantee consistent access
without byte truncation.
The patch also converts much of LLVM to use the new facility. These changes
include several probable bug fixes for truncated output, a programming error
that's no longer possible with the new interface.
llvm-svn: 211749
set on the calling thread.
This allows libclang's indexing threads to propagate their priority to the clang module building threads.
rdar://17459872
llvm-svn: 211747