This patch teaches the backend how to canonicalize shuffle vectors
according to the rule:
- (shuffle (FADD A, B), (FSUB A, B), Mask) ->
(shuffle (FSUB A, -B), (FADD A, -B), Mask)
Where 'Mask' is:
<0,5,2,7> ;; for v4f32 and v4f64 shuffles.
<0,3> ;; for v2f64 shuffles.
<0,9,2,11,4,13,6,15> ;; for v8f32 shuffles.
In general, ISel only knows how to pattern-match a canonical
'fadd + fsub + blendi' dag node sequence into an ADDSUB instruction.
This new rule converts a non-canonical dag sequence into a
canonical one that will be matched by a single ADDSUB at ISel stage.
The idea of converting a non-canonical ADDSUB into a canonical one by
swapping the first two operands of the shuffle, and then negating the
second operand of the FADD and FSUB, was originally proposed by Hal Finkel.
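The equivalence behind the rule can be checked lane by lane. The following
is a minimal sketch in plain C++, not backend code: V4, kMask, fadd, fsub,
fneg and shuffle are hypothetical stand-ins for the v4f32 dag nodes, and the
mask <0,5,2,7> is the v4f32 mask listed above.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using V4 = std::array<float, 4>;  // hypothetical stand-in for a v4f32 value

// Mask <0,5,2,7>: indices 0..3 select from the first operand, 4..7 from the second.
constexpr std::array<std::size_t, 4> kMask = {0, 5, 2, 7};

V4 fadd(const V4 &a, const V4 &b) {
  V4 r{};
  for (std::size_t i = 0; i < 4; ++i) r[i] = a[i] + b[i];
  return r;
}

V4 fsub(const V4 &a, const V4 &b) {
  V4 r{};
  for (std::size_t i = 0; i < 4; ++i) r[i] = a[i] - b[i];
  return r;
}

V4 fneg(const V4 &a) {
  V4 r{};
  for (std::size_t i = 0; i < 4; ++i) r[i] = -a[i];
  return r;
}

V4 shuffle(const V4 &lhs, const V4 &rhs) {
  V4 r{};
  for (std::size_t i = 0; i < 4; ++i)
    r[i] = kMask[i] < 4 ? lhs[kMask[i]] : rhs[kMask[i] - 4];
  return r;
}

// (shuffle (FADD A, B), (FSUB A, B), Mask)
V4 non_canonical(const V4 &a, const V4 &b) {
  return shuffle(fadd(a, b), fsub(a, b));
}

// (shuffle (FSUB A, -B), (FADD A, -B), Mask)
V4 canonical(const V4 &a, const V4 &b) {
  const V4 nb = fneg(b);
  return shuffle(fsub(a, nb), fadd(a, nb));
}

// Both forms should produce the same lanes for these sample inputs.
void self_check() {
  const V4 a = {1.0f, 2.0f, 3.0f, 4.0f};
  const V4 b = {0.5f, 0.25f, 8.0f, 16.0f};
  assert(non_canonical(a, b) == canonical(a, b));
}
```

In the canonical form the even lanes come from the FSUB and the odd lanes
from the FADD, while the computed values stay the same because
A - (-B) == A + B and A + (-B) == A - B.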
llvm-svn: 211771
Add support for generating optimization remarks after completing the
detection of Scops.
The goal is to provide end-users with useful hints about opportunities that
help to increase the size of the detected Scops in their code.
By default the remark is unspecified and the debug location is empty. Future
patches have to expand on the messages generated.
This patch brings a simple test case for ReportFuncCall to demonstrate the
feature.
Reports all missed opportunities to increase the size/number of valid
Scops:
clang <...> -Rpass-missed="polly-detect" <...>
opt <...> -pass-remarks-missed="polly-detect" <...>
Reports beginning and end of all valid Scops:
clang <...> -Rpass="polly-detect" <...>
opt <...> -pass-remarks="polly-detect" <...>
Differential Revision: http://reviews.llvm.org/D4171
llvm-svn: 211769
Previously dllimport variables inside of template arguments relied on
not using the C++11 codepath when -fms-compatibility was set.
While this allowed us to achieve compatibility with MSVC, it did so at
the expense of MinGW.
Instead, try to use the DeclRefExpr we dig out of the template argument.
If it has the dllimport attribute, accept it and skip the C++11
null-pointer check.
llvm-svn: 211766
This patch enables transforms for
(x + (~(y | c) + 1)) --> x - (y | c) if c is even
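The arithmetic behind the pattern is the two's-complement identity
~v + 1 == -v. The following is a minimal sketch, assuming 32-bit unsigned
wraparound arithmetic; the function names are hypothetical. It only checks
that both sides compute the same value and does not model the condition on
c or the pattern matching done by the patch.

```cpp
#include <cassert>
#include <cstdint>

// Left-hand side of the transform: x + (~(y | c) + 1).
std::uint32_t before_transform(std::uint32_t x, std::uint32_t y, std::uint32_t c) {
  return x + (~(y | c) + 1u);
}

// Right-hand side of the transform: x - (y | c).
std::uint32_t after_transform(std::uint32_t x, std::uint32_t y, std::uint32_t c) {
  return x - (y | c);
}

// Spot-check a few inputs, including one that wraps around.
void self_check() {
  assert(before_transform(10u, 5u, 2u) == after_transform(10u, 5u, 2u));
  assert(before_transform(0u, 0x80000000u, 4u) == after_transform(0u, 0x80000000u, 4u));
}
```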
Differential Revision: http://reviews.llvm.org/D4209
llvm-svn: 211765
Folding a reference to a thread_local variable into another global
variable's initializer is very problematic: no relocation exists to
represent such an access.
llvm-svn: 211762
Improve the warning when building with -fprofile-instr-use and a file
appears not to have been profiled at all. This keys on whether a
function is defined in the main file or not to avoid false negatives
when one includes a header with functions that have been profiled.
llvm-svn: 211760
Summary:
The BSDs and Darwin all forward the whole 'u' group, but gcc only
forwards -u so far as I can tell. I only forward -u, since that's a
minimal change, and many people object to magically recognizing and
forwarding linker arguments.
Reviewers: chandlerc, joerg
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D4304
llvm-svn: 211756
The *_alt defs for vcmp are used by the InstParser (the asm string in the main
def is used by the InstPrinter). The *_alt defs were accepting vector registers
as destination rather than mask registers.
llvm-svn: 211750
string_ostream is a safe and efficient string builder that combines opaque
stack storage with a built-in ostream interface.
small_string_ostream<bytes> additionally lets the caller specify a stack
storage size other than the default 128 bytes. Beyond that size, storage is
transferred to the heap.
This convenient class can be used in most places an
std::string+raw_string_ostream pair or SmallString<>+raw_svector_ostream pair
would previously have been used, in order to guarantee consistent access
without byte truncation.
The patch also converts much of LLVM to use the new facility. These changes
include several probable bug fixes for truncated output, a programming error
that's no longer possible with the new interface.
llvm-svn: 211749
set on the calling thread.
This allows libclang's indexing threads to propagate their priority to the clang module building threads.
rdar://17459872
llvm-svn: 211747
This was written by:
Albert Wong <ajwong@chromium.org>
Antoine Labour <piman@chromium.org>
Dana Jansen <danakj@chromium.org>
Jonathan Roelofs <jonathan@codesourcery.com>
Nico Weber <thakis@chromium.org>
llvm-svn: 211743
The new code will be behind a LIBCXXABI_ARM_EHABI define (so that platforms
that don't want it can continue using e.g. SJLJ). This commit mostly just
adds the LIBCXXABI_ARM_EHABI define.
llvm-svn: 211739
This is a follow-up to David's r211677. For the following code,
we would end up referring to 'foo' in the initializer for 'arr',
and then fail to link, because 'foo' is dllimport and needs to be
accessed through the __imp_?foo.
__declspec(dllimport) extern const char foo[];
const char* f() {
static const char* const arr[] = { foo };
return arr[0];
}
Differential Revision: http://reviews.llvm.org/D4299
llvm-svn: 211736
The vector components were mistakenly using () instead of {}, which caused
all but the last vector component to be dropped on the floor.
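The commit does not show the affected code, so the following is only a C++
analogy of this class of bug, with hypothetical function names. Inside
parentheses the commas form a comma expression that evaluates and discards
every operand except the last, whereas braces initialize all components.

```cpp
#include <array>
#include <cassert>

// Braces: all four components initialize the aggregate.
std::array<int, 4> with_braces(int a, int b, int c, int d) {
  return {a, b, c, d};
}

// Parentheses: a, b and c are evaluated and discarded; only d survives.
int with_parens(int a, int b, int c, int d) {
  return (a, b, c, d);
}

void self_check() {
  const std::array<int, 4> all = {1, 2, 3, 4};
  assert(with_braces(1, 2, 3, 4) == all);
  assert(with_parens(1, 2, 3, 4) == 4);
}
```

Compilers often warn about the discarded operands in the parenthesized form.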
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Jeroen Ketema <j.ketema@imperial.ac.uk>
llvm-svn: 211733
For now, this is only used by its unit tests. It is similar to the API
in llvm::sys::fs::recursive_directory_iterator, but without some of the
more complex features, such as requesting that the iterator not recurse
into the next directory.
llvm-svn: 211732
If the cmp is in a different basic block, then it is possible that not all
operands of that compare have defined registers. This can happen when one of
the operands to the cmp is a load and the load gets folded into the cmp. In
this case FastISel will skip the load instruction and the vreg is never
defined.
llvm-svn: 211730
Previously, only the starting locations of the candidate interval
and the existing interval were compared. To correctly detect
range intersections, it is necessary to compare the entire range
of both intervals against each other.
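The difference between the two checks can be sketched as follows. This is a
minimal illustration, not the patched code: the Interval type, the half-open
[start, start + size) convention and the function names are assumptions, the
start-only check is just one plausible reading of the old behavior, and
overflow of start + size is not handled.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical half-open interval [start, start + size).
struct Interval {
  std::uint64_t start;
  std::uint64_t size;
};

// Start-only comparison: misses intervals that overlap partially.
bool starts_collide(const Interval &a, const Interval &b) {
  return a.start == b.start;
}

// Full-range comparison: true when the two ranges share at least one address.
bool intersects(const Interval &a, const Interval &b) {
  return a.start < b.start + b.size && b.start < a.start + a.size;
}

void self_check() {
  const Interval existing = {100, 50};   // covers [100, 150)
  const Interval candidate = {120, 10};  // covers [120, 130)
  assert(!starts_collide(candidate, existing));  // overlap goes unnoticed
  assert(intersects(candidate, existing));       // overlap is detected
}
```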
Reviewed by: scallanan
Differential Revision: http://reviews.llvm.org/D4286
llvm-svn: 211726
Consider the following code:
template <typename T> class Base {};
class __declspec(dllexport) Derived : public Base<int> {};
When the base of an exported or imported class is a class template
specialization, MSVC will propagate the dll attribute to the base.
In the example code, Base<int> becomes a dllexported class.
This commit makes Clang do the propagation when the base hasn't been
instantiated yet, and warns about it being unsupported otherwise.
This is different from MSVC, which allows changing a specialization
back and forth between dllimport and dllexport and seems to let the
last one win. Changing the dll attribute after instantiation would be
hard for us, and doesn't seem to come up in practice, so I think this
is a reasonable limitation to have.
MinGW doesn't do this kind of propagation.
Differential Revision: http://reviews.llvm.org/D4264
llvm-svn: 211725
This situation does bad things when inlined, so I've fixed Clang not to
produce inlinable call sites without locations when the caller has debug
info (in the one case where I could find that this occurred). This
updates the PR20038 test case to be what clang now produces, and re-adds
the assertion that had to be removed due to this bug.
I've also beefed up the debug info verifier to help diagnose these
issues in the future, and I hope to add checks to the inliner to just
assert-fail if it encounters this situation. If, in the future, we
decide we have to cope with this situation, the right thing to do is
probably to just remove all the DebugLocs from the inlined instructions.
llvm-svn: 211723
With && at the top level of an expression, the last thing done when
emitting the expression was an unconditional jump to the cleanup block.
To reduce the amount of stepping, the DebugLoc is omitted from the
unconditional jump. This is done by clearing the IRBuilder's
"CurrentDebugLocation"*. If this is not set to some non-empty value
before the cleanup block is emitted, the cleanups don't get a location
either. If a call without a location is emitted in a function with debug
info, and that call is then inlined, bad things happen. (Without a
location for the call site, the inliner would just leave the inlined
DebugLocs as they were, pointing to roots in the original function, not
inlined into the current function.)
A follow-up commit to LLVM will ensure that breaking the invariants of the
DebugLoc chains by having chains that don't lead to the current function
will fail assertions, so we shouldn't accidentally slip any of these
cases in anymore. Those assertions may reveal further cases that need to
be fixed in clang, though I've tried to test heavily to avoid that.
* See r128471, r128513 for the code that clears the
CurrentDebugLocation. Simply removing this code or moving the code
into IRBuilder to apply to all unconditional branches would regress
desired behavior, unfortunately.
llvm-svn: 211722