In cases of nested module builds, or when you care how long module builds take,
this information was not previously easily available / obvious.
llvm-svn: 219658
The bots can't seem to find an include file. Reverting for now and
I'll look into it in a bit.
This reverts commits r219647 and r219648.
llvm-svn: 219649
We currently read serialized diagnostics directly in the C API, which
makes it difficult to reuse this logic elsewhere. This extracts the
core of the serialized diagnostic parsing logic into a base class that
can be subclassed using a visitor pattern.
llvm-svn: 219647
This change adds UBSan check to upcasts. Namely, when we
perform derived-to-base conversion, we:
1) check that the pointer-to-derived has suitable alignment
and underlying storage, if this pointer is non-null.
2) if vptr-sanitizer is enabled, and we perform conversion to
virtual base, we check that pointer-to-derived has a matching vptr.
llvm-svn: 219642
Fix order of evaluation bug in DynTypedMatcher::constructVariadic().
If it evaluates right-to-left, the vector gets moved before we read the
kind from it.
llvm-svn: 219624
Summary:
Change r219118 fixed the bug for anyOf and eachOf, but it is still
present for unless.
The variadic wrapper doesn't have enough information to know how to
restrict the type. Different operators handle restrict failures in
different ways.
Reviewers: klimek
Subscribers: klimek, cfe-commits
Differential Revision: http://reviews.llvm.org/D5731
llvm-svn: 219622
Some early revisions of the Cortex-A53 have an erratum (835769) whereby it is
possible for a 64-bit multiply-accumulate instruction in AArch64 state to
generate an incorrect result. The details are quite complex and hard to
determine statically, since branches in the code may exist in some
circumstances, but all cases end with a memory (load, store, or prefetch)
instruction followed immediately by the multiply-accumulate operation.
The safest work-around for this issue is to make the compiler avoid emitting
multiply-accumulate instructions immediately after memory instructions and the
simplest way to do this is to insert a NOP.
This patch implements clang options to enable this workaround in the backend.
The work-around code generation is not enabled by default.
llvm-svn: 219604
This patch generates call to "kmpc_push_num_threads(ident_t *loc, kmp_int32 global_tid, kmp_int32 num_threads);" library function before calling "kmpc_fork_call" each time there is an associated "num_threads" clause in the "omp parallel" directive.
Differential Revision: http://reviews.llvm.org/D5145
llvm-svn: 219599
Adds codegen for 'if' clause. Currently only for 'if' clause used with the 'parallel' directive.
If condition evaluates to true, the code executes parallel version of the code by calling __kmpc_fork_call(loc, 1, microtask, captured_struct/*context*/), where loc - debug location, 1 - number of additional parameters after "microtask" argument, microtask - is outlined finction for the code associated with the 'parallel' directive, captured_struct - list of variables captured in this outlined function.
If condition evaluates to false, the code executes serial version of the code by executing the following code:
global_thread_id.addr = alloca i32
store i32 global_thread_id, global_thread_id.addr
zero.addr = alloca i32
store i32 0, zero.addr
kmpc_serialized_parallel(loc, global_thread_id);
microtask(global_thread_id.addr, zero.addr, captured_struct/*context*/);
kmpc_end_serialized_parallel(loc, global_thread_id);
Where loc - debug location, global_thread_id - global thread id, returned by __kmpc_global_thread_num() call or passed as a first parameter in microtask() call, global_thread_id.addr - address of the variable, where stored global_thread_id value, zero.addr - implicit bound thread id (should be set to 0 for serial call), microtask() and captured_struct are the same as in parallel call.
Also this patch checks if the condition is constant and if it is constant it evaluates its value and then generates either parallel version of the code (if the condition evaluates to true), or the serial version of the code (if the condition evaluates to false).
Differential Revision: http://reviews.llvm.org/D4716
llvm-svn: 219597
Previously loop hints such as #pragma loop vectorize_width(#) required a constant. This patch allows a constant expression to be used as well. Such as a non-type template parameter or an expression (2 * c + 1).
Reviewed by Richard Smith
llvm-svn: 219589
While we ran getUnqualifiedType over the catch type,
it isn't enough for array types. Use getUnqualifiedArrayType instead.
This fixes PR21252.
llvm-svn: 219582
and !=) to support mixed complex and real operand types.
This requires removing an assert from SemaChecking, and adding support
both to the constant evaluator and the code generator to synthesize the
imaginary part when needed. This seemed somewhat cleaner than having
just the comparison operators force real-to-complex conversions.
I've added test cases for these operations. I'm really terrified that
there were *no* tests in-tree which exercised this.
This turned up when trying to build R after my change to the complex
type lowering.
llvm-svn: 219570
for complex math.
This should fix the windows build bots that started having trouble here
and generally fix complex libcall emission on targets which use sret for
complex data types. It also makes the code a bit simpler (despite
calling into a much more complex bucket of code).
llvm-svn: 219565
operators where one type is a C complex type, and to emit both the
efficient and correct implementation for complex arithmetic according to
C11 Annex G using this extra information.
For both multiply and divide the old code was writing a long-hand
reduced version of the math without any of the special handling of inf
and NaN recommended by the standard here. Instead of putting more
complexity here, this change does what GCC does which is to emit
a libcall for the fully general case.
However, the old code also failed to do the proper minimization of the
set of operations when there was a mixed complex and real operation. In
those cases, C provides a spec for much more minimal operations that are
valid. Clang now emits the exact suggested operations. This change isn't
*just* about performance though, without minimizing these operations, we
again lose the correct handling of infinities and NaNs. It is critical
that this happen in the frontend based on assymetric type operands to
complex math operations.
The performance implications of this change aren't trivial either. I've
run a set of benchmarks in Eigen, an open source mathematics library
that makes heavy use of complex. While a few have slowed down due to the
libcall being introduce, most sped up and some by a huge amount: up to
100% and 140%.
In order to make all of this work, also match the algorithm in the
constant evaluator to the one in the runtime library. Currently it is
a broken port of the simplifications from C's Annex G to the long-hand
formulation of the algorithm.
Splitting this patch up is very hard because none of this works without
the AST change to preserve non-complex operands. Sorry for the enormous
change.
Follow-up changes will include support for sinking the libcalls onto
cold paths in common cases and fastmath improvements to allow more
aggressive backend folding.
Differential Revision: http://reviews.llvm.org/D5698
llvm-svn: 219557
declaration in the instantiation if the previous declaration came from another
definition of the class template that got merged into the pattern definition.
llvm-svn: 219552
Summary: This fixes PR21235.
Test Plan: Includes an automated test.
Reviewers: hansw
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D5718
llvm-svn: 219551
We can safely rely on the architecture to distinguish iOS device builds from
iOS simulator builds. We already have code to do that, in fact. This simplifies
some of the error checking for the option handling.
llvm-svn: 219545
When reading a serialized diagnostic location with no file ID, we were
failing to increment the cursor past the rest of the location. This
would lead to the flags and category always appearing blank in such
diagnostics.
This changes the function to unconditionally increment the cursor and
updates the test to check for the correct output instead of testing
that we were doing this wrong. I've also updated the error check to
check for the correct number of fields.
llvm-svn: 219538
auto synthesized because it is synthesized in its super
class. locate property declaration in super class
which will default synthesize the property. rdar://18488727
llvm-svn: 219535
This was previously only used when explicitly requested with a command line
option because it had to work with some old versions of the linker when it
was first introduced. That is ancient history now, and it should be safe to
use the correct option even when using the IPHONEOS_DEPLOYMENT_TARGET
environment variable to specify that the target is the iOS simulator.
Besides updating the test for this, I also added a few more tests for the
iOS linker options.
llvm-svn: 219527
It's possible to construct cases where the first field we are trying to
copy is in the middle of an IR field. In some complicated cases, we
would fail to use an appropriate offset inside the object. Earlier
builds of clang seemed to miscompile the code by copying an insufficient
number of bytes. Up until now, we would assert: the copying offset was
insufficiently aligned.
This fixes PR21232.
llvm-svn: 219524
initializers, and captured VLA types so that we can
answer questions like "is this a bit-field" without
looking at the enclosing DeclContext. NFC.
llvm-svn: 219522
The current VSX feature for PowerPC specifies availability of the VSX
instructions added with the 2.06 architecture version. With 2.07, the
architecture adds new instructions to both the Category:Vector and
Category:VSX instruction sets. Additionally, unaligned vector storage
operations have improved performance.
This patch adds a feature to provide access to the new instructions
and performance capabilities of Power8. For compatibility with GCC,
the feature is controlled via a new -mpower8-vector switch, and the
feature causes the __POWER8_VECTOR__ builtin define to be generated by
the preprocessor.
There is a companion patch for llvm being committed at the same time.
llvm-svn: 219502
Moved CGOpenMPRegionInfo from CGOpenMPRuntime.h to CGOpenMPRuntime.cpp file and reworked the code for this change. Also added processing of ThreadID variable passed as an argument in outlined functions in parallel and task directives.
llvm-svn: 219490
This patch makes class OMPPrivateScope a common class for all private variables. Reworked processing of firstprivate variables (now it is based on OMPPrivateScope too).
llvm-svn: 219486
It turns out that this was never used. Instead we just use the
IPHONEOS_DEPLOYMENT_TARGET variable for both iOS devices and simulator.
rdar://problem/18596744
llvm-svn: 219467
Looks like llvm::sys::path::filename() was canonicalizing my paths
before emitting them for FileCheck to stumble over.
Fix a style nit with r219460 while I'm at it.
llvm-svn: 219464
When building with coverage, -no-integrated-as, and -c, the driver was
emitting -cc1 -coverage-file pointing at a file in /tmp. Ensure the
coverage file is emitted in the same directory as the output file.
llvm-svn: 219460
Make it possible to pass NULL through variadic functions on 64-bit
Windows targets. The Visual C++ headers define NULL to 0, when they
should define it to 0LL on Win64 so that NULL is a pointer-sized
integer.
Fixes PR20949.
Reviewers: thakis, rsmith
Differential Revision: http://reviews.llvm.org/D5480
llvm-svn: 219456
Summary:
There was an assumption that there were no matchers that were overloaded
on matchers and other types of arguments.
This assumption was broken recently with the addition of new matcher
overloads.
Fixes http://llvm.org/PR21226
Reviewers: pcc
Subscribers: klimek, cfe-commits
Differential Revision: http://reviews.llvm.org/D5711
llvm-svn: 219450
Summary:
Remove unnecessary wrapping for the 0 and 1 matcher cases of
makeAllOfComposite(). We don't need a variadic wrapper for those cases.
Refactor TrueMatcher to take advandage of the new conversions between
DynTypedMatcher and Matcher<T>. Also, make it a singleton.
This change improves our clang-tidy related benchmarks by ~12%.
Reviewers: klimek
Subscribers: klimek, cfe-commits
Differential Revision: http://reviews.llvm.org/D5675
llvm-svn: 219431
Summary:
This change adds an experimental flag -fsanitize-address-field-padding=N (0, 1, 2)
to clang and driver. With this flag ASAN will be able to detect some cases of
intra-object-overflow bugs,
see https://code.google.com/p/address-sanitizer/wiki/IntraObjectOverflow
There is no actual functionality here yet, just the flag parsing.
The functionality is being reviewed at http://reviews.llvm.org/D5687
Test Plan: Build and run SPEC, LLVM Bootstrap, Chrome with this flag.
Reviewers: samsonov
Reviewed By: samsonov
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D5676
llvm-svn: 219417
Assertion failed: "Computed __func__ length differs from type!"
Reworked PredefinedExpr representation with internal StringLiteral field for function declaration.
Differential Revision: http://reviews.llvm.org/D5365
llvm-svn: 219393
Includes parsing and semantic analysis for 'omp teams' directive support from OpenMP 4.0. Adds additional analysis to 'omp target' directive with 'omp teams' directive.
llvm-svn: 219385
Summary:
The current code uses memset to re-initialize EHCleanupScope objects
with breaks the assumptions of the upcoming asan's intra-object-overflow checker.
If there is no DTOR, the new checker will refuse to work.
Test Plan: bootstrap with asan
Reviewers: rnk
Reviewed By: rnk
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D5656
llvm-svn: 219331
Summary: This fixes PR21155.
Test Plan: The patch includes a test.
Reviewers: rnk
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D5619
llvm-svn: 219322
This patch generates some helper variables that used as private copies of the corresponding original variables inside an OpenMP 'parallel' directive. These generated variables are initialized by copy using values of the original variables (with the copy constructor, if any). For arrays, initializator is generated for single element and in the codegen procedure this initial value is automatically propagated between all elements of the private copy.
In outlined function, references to original variables are replaced by the references to these private helper variables. At the end of the initialization of the private variables an implicit barier is generated by calling __kmpc_barrier(...) runtime function to be sure that all threads were initialized using original values of the variables.
Differential Revision: http://reviews.llvm.org/D5140
llvm-svn: 219306
Boostrapping LLVM+Clang+LLDB without threshold on object size for
lifetime markers insertion has shown there was no significant change
in compile time, so let the stack slot colorizer do its optimization
for all slots.
llvm-svn: 219303
This patch generates some helper variables that used as private copies of the corresponding original variables inside an OpenMP 'parallel' directive. These generated variables are initialized by copy using values of the original variables (with the copy constructor, if any). For arrays, initializator is generated for single element and in the codegen procedure this initial value is automatically propagated between all elements of the private copy.
In outlined function, references to original variables are replaced by the references to these private helper variables. At the end of the initialization of the private variables an implicit barier is generated by calling __kmpc_barrier(...) runtime function to be sure that all threads were initialized using original values of the variables.
Differential Revision: http://reviews.llvm.org/D5140
llvm-svn: 219297
This patch generates some helper variables that used as private copies of the corresponding original variables inside an OpenMP 'parallel' directive. These generated variables are initialized by copy using values of the original variables (with the copy constructor, if any). For arrays, initializator is generated for single element and in the codegen procedure this initial value is automatically propagated between all elements of the private copy.
In outlined function, references to original variables are replaced by the references to these private helper variables. At the end of the initialization of the private variables an implicit barier is generated by calling __kmpc_barrier(...) runtime function to be sure that all threads were initialized using original values of the variables.
Differential Revision: http://reviews.llvm.org/D5140
llvm-svn: 219295
Clang won't emit any prologues for such functions, so it would assert trying to
codegen the parameter references.
This patch makes Clang check the extended asm inputs and outputs for
references to function parameters.
Differential Revision: http://reviews.llvm.org/D5640
llvm-svn: 219272
Summary:
Previously CodeGen assumed that static locals were emitted before they
could be accessed, which is true for automatic storage duration locals.
However, it is possible to have CodeGen emit a nested function that uses
a static local before emitting the function that defines the static
local, breaking that assumption.
Fix it by creating the static local upon access and ensuring that the
deferred function body gets emitted. We may not be able to emit the
initializer properly from outside the function body, so don't try.
Fixes PR18020. See also previous attempts to fix static locals in
PR6769 and PR7101.
Reviewers: rsmith
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D4787
llvm-svn: 219265