Extend the __declspec(dll*) attribute to cover ObjC interfaces. This was
requested by Microsoft for their ObjC support. Cover both import and export.
This only adds the semantic analysis portion of the support, code-generation
still remains outstanding. Add some basic initial documentation on the
attributes that were previously empty. Tweak the previous tests to use the
relative expected-warnings to make the tests easier to read.
llvm-svn: 275610
This changes the CompilerInstance::createOutputFile function to return
a std::unique_ptr<llvm::raw_ostream>, rather than an llvm::raw_ostream
implicitly owned by the CompilerInstance. This in most cases required that
I move ownership of the output stream to the relevant ASTConsumer.
The motivation for this change is to allow BackendConsumer to be a client
of interfaces such as D20268 which take ownership of the output stream.
Differential Revision: http://reviews.llvm.org/D21537
llvm-svn: 275507
passed on the command line but never actually used. We consider a (top-level)
module to be used if any part of it is imported, either by the current
translation unit, or by any part of a top-level module that is itself used.
(Put another way, a module is used if an implicit modules build would have
loaded its .pcm file.)
llvm-svn: 275481
This patch implements PR#22821.
Taking the address of a packed member is dangerous since the reduced
alignment of the pointee is lost. This can lead to memory alignment
faults in some architectures if the pointer value is dereferenced.
This change adds a new warning to clang emitted when taking the address
of a packed member. A packed member is either a field/data member
declared as attribute((packed)) or belonging to a struct/class
declared as such. The associated flag is -Waddress-of-packed-member.
Conversions (either implicit or via a valid casting) to pointer types
with lower or equal alignment requirements (e.g. void* or char*)
silence the warning.
This change also adds a new error diagnostic when the user attempts to
bind a reference to a packed member, regardless of the alignment.
Differential Revision: https://reviews.llvm.org/D20561
llvm-svn: 275417
This patch is to implement sema and parsing for 'target parallel for simd' pragma.
Differential Revision: http://reviews.llvm.org/D22096
llvm-svn: 275365
-fxray-instrument: enables XRay annotation of IR
-fxray-instruction-threshold: configures the threshold for function size (looking at IR instructions), and allow LLVM to decide whether to add the nop sleds later on in the process.
Also implements the related xray_always_instrument and xray_never_instrument function attributes.
Patch by Dean Michael Berris.
llvm-svn: 275330
This encourages checkers to make logical decisions depending on
value of which region was the symbol under consideration
introduced to denote.
A similar technique is already used in a couple of checkers;
they were modified to call the new method.
Differential Revision: http://reviews.llvm.org/D22242
llvm-svn: 275290
http://reviews.llvm.org/D21904
This patch is similar to the implementation of 'private' clause: it adds a list of private pointers to be used within the target data region to store the device pointers returned by the runtime.
Please refer to the following document for a full description of what the runtime witll return in this case (page 10 and 11):
https://github.com/clang-omp/OffloadingDesign
I am happy to answer any question related to the runtime interface to help reviewing this patch.
llvm-svn: 275271
This is to allow distributed build systems, that do not preserve time stamps, to use PCH files.
Second and last part of the patch proposed at:
Differential Revision: http://reviews.llvm.org/D20867
llvm-svn: 275267
We need to mark the appropriate bits in ThrowInfo and HandlerType so
that the personality routine can correctly handle qualification
conversions.
llvm-svn: 275154
Summary:
return llvm::Expected<> to carry error status and error information.
This is the first step towards introducing "Error" into tooling::Replacements.
Reviewers: djasper, klimek
Subscribers: ioeric, klimek, cfe-commits
Differential Revision: http://reviews.llvm.org/D21601
llvm-svn: 275062
- Changes diagnostics for Blocks to be implicitly
const qualified OpenCL v2.0 s6.12.5.
- Added and unified diagnostics of some OpenCL special types:
blocks, images, samplers, pipes. These types are intended for use
with the OpenCL builtin functions only and, therefore, most regular
uses are not allowed including assignments, arithmetic operations,
pointer dereferencing, etc.
Review: http://reviews.llvm.org/D21989
llvm-svn: 275061
MASM (ML.exe and ML64.exe) and older versions of MSVC (CL.exe) support a
flag called /Zd which is more-or-less -gline-tables-only.
It seems nicer to support this flag instead of exposing
-gline-tables-only.
llvm-svn: 274991
Add OCL option -cl-no-signed-zeros to driver options.
Also added to opencl.cl testcases.
Patch by Aaron En Ye Shi.
Differential Revision: http://reviews.llvm.org/D22067
llvm-svn: 274923
OpenCL s6.6: "Access qualifier must be used with image object arguments
of kernels and of user-defined functions [...] If no qualifier is
provided, read_only is assumed".
This does not define the behavior for image types used in typedef
declaration, but following the spec logic, we should allow access
qualifiers specification in typedefs, e.g.:
typedef write_only image1d_t img1d_wo;
Unlike cv-qualifiers, user cannot add access qualifier to a typedef
type, i.e. this is not allowed:
typedef image1d_t img1d; // note: previously declared 'read_only' here
void foo(write_only img1d im) {} // error: multiple access qualifier
Patch by Andrew Savonichev.
Reviewers: Anastasia Stulova.
Differential revision: http://reviews.llvm.org/D20948
llvm-svn: 274858
Original commit message:
"Add postorder traversal support to the RecursiveASTVisitor.
This feature needs to be explicitly enabled by overriding shouldTraversePostOrder()
as it has performance drawbacks for the iterative Stmt-traversal.
Patch by Raphael Isemann!
Reviewed by Richard Smith and Benjamin Kramer."
llvm-svn: 274830
This proposed patch adds crude handling of atomics to the static analyzer.
Rather than ignore AtomicExprs, as we now do, this patch causes the analyzer
to escape the arguments. This is imprecise -- and we should model the
expressions fully in the future -- but it is less wrong than ignoring their
effects altogether.
This is rdar://problem/25353187
Differential Revision: http://reviews.llvm.org/D21667
llvm-svn: 274816
Summary:
Raise an error if you're using a CUDA installation that's too old for
the requested architectures. In practice, this means that you need a
CUDA 8 install to compile for sm_6*.
Reviewers: tra
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D21869
llvm-svn: 274781
The builtin was renamed in r274770. But __syncthreads is part of our
user-facing API, so we need to keep the name as-is.
Patch by Justin Bogner.
llvm-svn: 274780
Summary:
Currently our handling of CUDA architectures is scattered all around
clang. This patch centralizes it.
A key advantage of this centralization is that you can now write a C++
switch on e.g. CudaArch and get a compile error if you don't handle one
of the enum values.
Reviewers: tra
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D21867
llvm-svn: 274681
Summary: This patch is an implementation of sema and parsing for the OpenMP composite pragma 'distribute simd'.
Differential Revision: http://reviews.llvm.org/D22007
llvm-svn: 274604
- Added new Builtins: enqueue_kernel, get_kernel_work_group_size
and get_kernel_preferred_work_group_size_multiple.
These Builtins use custom check to diagnose parameters of the passed Blocks
i. e. variable number of 'local void*' type params, and check different
overloads specified in Table 6.31 of OpenCL v2.0.
- IR is generated as an internal library call for each OpenCL Builtin,
reusing ObjC Block implementation.
Review: http://reviews.llvm.org/D20249
llvm-svn: 274540
Summary: This patch is an implementation of sema and parsing for the OpenMP composite pragma 'distribute parallel for simd'.
Differential Revision: http://reviews.llvm.org/D21977
llvm-svn: 274530
Currently we only have OpenCL 2.0 Builtins i.e. pipes or address space conversions.
They have to be added only in the version 2.0 compilation mode to make the identifiers
available for use in the other versions.
Review: http://reviews.llvm.org/D20249
llvm-svn: 274509
member is redundantly redeclared outside the class definition in code built in
c++17 mode, ensure we emit a non-discardable definition of the data member for
c++11 and c++14 compilations to use.
llvm-svn: 274416
This feature needs to be explicitly enabled by overriding shouldTraversePostOrder()
as it has performance drawbacks for the iterative Stmt-traversal.
Patch by Raphael Isemann!
Reviewed by Richard Smith and Benjamin Kramer.
llvm-svn: 274348
This patch adds a __nth_element builtin that allows fetching the n-th type of a
parameter pack with very little compile-time overhead. The patch was inspired by
r252036 and r252115 by David Majnemer, which add a similar __make_integer_seq
builtin for efficiently creating a std::integer_sequence.
Reviewed as D15421. http://reviews.llvm.org/D15421
llvm-svn: 274316
Summary: This patch changes the options used by offloading to start with -fopenmp instead of -fomp. This makes the option naming more consistent and materializes a suggestion by Richard Smith in http://reviews.llvm.org/D9888.
Reviewers: hfinkel, carlo.bertolli, arpith-jacob, ABataev
Subscribers: kkwli0, cfe-commits, caomhin
Differential Revision: http://reviews.llvm.org/D21841
llvm-svn: 274283
Summary:
Summary:
Change Clang calling convention SpirKernel to OpenCLKernel.
Set calling convention OpenCLKernel for amdgcn as well.
Add virtual method .getOpenCLKernelCallingConv() to TargetCodeGenInfo
and use it to set target calling convention for AMDGPU and SPIR.
Update tests.
Reviewers: rsmith, tstellarAMD, Anastasia, yaxunl
Subscribers: kzhuravl, cfe-commits
Differential Revision: http://reviews.llvm.org/D21367
llvm-svn: 274220
No semantic analysis yet.
This is a pain to disambiguate correctly, because the parsing rules for the
declaration form of a condition and of an init-statement are quite different --
for a token sequence that looks like a declaration, we frequently need to
disambiguate all the way to the ')' or ';'.
We could do better here in some cases by stopping disambiguation once we've
decided whether we've got an expression or not (rather than keeping going until
we know whether it's an init-statement declaration or a condition declaration),
by unifying our parsing code for the two types of declaration and moving the
syntactic checks into Sema; if this has a measurable impact on parsing
performance, I'll look into that.
llvm-svn: 274169
Allow -cl-std and other standard -cl- options from cc1 to driver.
Added a test for the options moved.
Patch by Aaron En Ye Shi.
Differential Revision: http://reviews.llvm.org/D21031
llvm-svn: 274150
constructor would be; this is effectively required by P0136R1. This has the
effect of exposing the validity of the base class initialization steps to
SFINAE checks.
llvm-svn: 274088
When a class property is accessed with an object instance, before this commit,
we try to apply a typo correction of the same property:
property 'c' not found on object of type 'A *'; did you mean 'c'?
With this commit, we correctly emit a diagnostics:
property 'c' is a class property; did you mean to access it with class 'A'?
rdar://26866973
llvm-svn: 274076
We continue accepting "macosx" but canonicalize it to "macos", When emitting
diagnostics, we use "macOS" instead of "OS X".
The PlatformName in TargetInfo is changed from "macosx" to "macos" so we can
directly compare the Platform in AvailabilityAttr with the PlatformName
in TargetInfo.
rdar://26795172
rdar://26800775
llvm-svn: 274064
Replace inheriting constructors implementation with new approach, voted into
C++ last year as a DR against C++11.
Instead of synthesizing a set of derived class constructors for each inherited
base class constructor, we make the constructors of the base class visible to
constructor lookup in the derived class, using the normal rules for
using-declarations.
For constructors, UsingShadowDecl now has a ConstructorUsingShadowDecl derived
class that tracks the requisite additional information. We create shadow
constructors (not found by name lookup) in the derived class to model the
actual initialization, and have a new expression node,
CXXInheritedCtorInitExpr, to model the initialization of a base class from such
a constructor. (This initialization is special because it performs real perfect
forwarding of arguments.)
In cases where argument forwarding is not possible (for inalloca calls,
variadic calls, and calls with callee parameter cleanup), the shadow inheriting
constructor is not emitted and instead we directly emit the initialization code
into the caller of the inherited constructor.
Note that this new model is not perfectly compatible with the old model in some
corner cases. In particular:
* if B inherits a private constructor from A, and C uses that constructor to
construct a B, then we previously required that A befriends B and B
befriends C, but the new rules require A to befriend C directly, and
* if a derived class has its own constructors (and so its implicit default
constructor is suppressed), it may still inherit a default constructor from
a base class
llvm-svn: 274049
Summary:
Currently output of child process, however in my use case, it
needs to be captured and presented to the user.
Add Redirect method to Compilation and use existing infrastructure
for redirecting output of commands.
Reviewers: tstellarAMD
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D21224
llvm-svn: 273997
[OpenMP] Initial implementation of parse and sema for composite pragma 'distribute parallel for'
This patch is an initial implementation for #distribute parallel for.
The main differences that affect other pragmas are:
The implementation of 'distribute parallel for' requires blocking of the associated loop, where blocks are "distributed" to different teams and iterations within each block are scheduled to parallel threads within each team. To implement blocking, sema creates two additional worksharing directive fields that are used to pass the team assigned block lower and upper bounds through the outlined function resulting from 'parallel'. In this way, scheduling for 'for' to threads can use those bounds.
As a consequence of blocking, the stride of 'distribute' is not 1 but it is equal to the blocking size. This is returned by the runtime and sema prepares a DistIncrExpr variable to hold that value.
As a consequence of blocking, the global upper bound (EnsureUpperBound) expression of the 'for' is not the original loop upper bound (e.g. in for(i = 0 ; i < N; i++) this is 'N') but it is the team-assigned block upper bound. Sema creates a new expression holding the calculation of the actual upper bound for 'for' as UB = min(UB, PrevUB), where UB is the loop upper bound, and PrevUB is the team-assigned block upper bound.
llvm-svn: 273884
http://reviews.llvm.org/D21564
This patch is an initial implementation for #distribute parallel for.
The main differences that affect other pragmas are:
The implementation of 'distribute parallel for' requires blocking of the associated loop, where blocks are "distributed" to different teams and iterations within each block are scheduled to parallel threads within each team. To implement blocking, sema creates two additional worksharing directive fields that are used to pass the team assigned block lower and upper bounds through the outlined function resulting from 'parallel'. In this way, scheduling for 'for' to threads can use those bounds.
As a consequence of blocking, the stride of 'distribute' is not 1 but it is equal to the blocking size. This is returned by the runtime and sema prepares a DistIncrExpr variable to hold that value.
As a consequence of blocking, the global upper bound (EnsureUpperBound) expression of the 'for' is not the original loop upper bound (e.g. in for(i = 0 ; i < N; i++) this is 'N') but it is the team-assigned block upper bound. Sema creates a new expression holding the calculation of the actual upper bound for 'for' as UB = min(UB, PrevUB), where UB is the loop upper bound, and PrevUB is the team-assigned block upper bound.
llvm-svn: 273705
-Wfor-loop-analysis warnings for a for-loop with a condition variable. In such
a case, the loop condition variable is modified on each iteration of the loop
by definition.
Original commit message:
Rearrange condition handling so that semantic checks on a condition variable
are performed before the other substatements of the construct are parsed,
rather than deferring them until the end. This allows better error recovery
from semantic errors in the condition, improves diagnostic order, and is a
prerequisite for C++17 constexpr if.
llvm-svn: 273600
During the core analysis, ExplodedNodes are added to the
ExplodedGraph, and those nodes are cached for deduplication purposes.
After core analysis, reports are generated. Here, trimmed copies of
the ExplodedGraph are made. Since the ExplodedGraph has already been
deduplicated, there is no need to deduplicate again.
This change makes it possible to add ExplodedNodes to an
ExplodedGraph without the overhead of deduplication. "Uncached" nodes
also cannot be iterated over, but none of the report generation code
attempts to iterate over all nodes. This change reduces the analysis
time of a large .C file from 3m43.941s to 3m40.256s (~1.6% speedup).
It should slightly reduce memory consumption. Gains should be roughly
proportional to the number (and path length) of static analysis
warnings.
This patch enables future work that should remove the need for an
InterExplodedGraphMap inverse map. I plan on using the (now unused)
ExplodedNode link to connect new nodes to the original nodes.
http://reviews.llvm.org/D21229
llvm-svn: 273572
The PIC and PIE levels are not independent. In fact, if PIE is defined
it is always the same as PIC.
This is clear in the driver where ParsePICArgs returns a PIC level and
a IsPIE boolean. Unfortunately that is currently lost and we pass two
redundant levels down the pipeline.
This patch keeps a bool and a PIC level all the way down to codegen.
llvm-svn: 273566
are performed before the other substatements of the construct are parsed,
rather than deferring them until the end. This allows better error recovery
from semantic errors in the condition, improves diagnostic order, and is a
prerequisite for C++17 constexpr if.
llvm-svn: 273548
Add support for /Ob1 (and equivalent -finline-hint-functions), which enable
inlining only for functions marked inline, either explicitly (via inline
keyword, for example), or implicitly (function definition in class body,
for example).
This works by enabling inlining pass, and adding noinline attribute to
every function not marked inline.
Patch by Rudy Pons <rudy.pons@ilod.org>!
Differential Revision: http://reviews.llvm.org/D20647
llvm-svn: 273440
This is a fix for PR28112:
https://llvm.org/bugs/show_bug.cgi?id=28112
The FP comparison intrinsics that take an immediate parameter (rather than specifying
a comparison predicate in the function name) were added with AVX; these are macros in
avxintrin.h. This patch makes clang behavior match gcc (error if a program tries to use
these without -mavx) and matches the Intel documentation, eg:
VCMPPS: m128 _mm_cmp_ps(m128 a, __m128 b, const int imm)
'V' means this is intended to only work with the AVX form of the instruction.
Differential Revision: http://reviews.llvm.org/D21306
llvm-svn: 273311
Summary:
Added calculateRangesAfterReplaments() to calculate original ranges as well as
newly affacted ranges in the new code.
Reviewers: klimek, djasper
Subscribers: cfe-commits, klimek
Differential Revision: http://reviews.llvm.org/D21547
llvm-svn: 273290
Include opencl-c.h by default as a module to utilize the automatic AST caching mechanism of clang modules.
Add an option -finclude-default-header to enable default header for OpenCL, which is off by default.
Differential Revision: http://reviews.llvm.org/D20444
llvm-svn: 273191
Add -mno-iamcu option to:
1) Countervail -miamcu option easily
2) Be compatible with GCC which supports this option
Differential Revision: http://reviews.llvm.org/D21469
llvm-svn: 273147
This mirrors the many other -i*after options to insert a new system search
directory at the end of the search path. This makes it possible to actually
inject a search path after the resource dir. This option is similar in spirit
to the /imsvc option in the clang-cl driver. This is needed to properly use the
driver for Windows targets where the clang headers wrap some of the system
headers.
This concept is actually useful on other targets (e.g. Linux) and would be
really easy to support on the core toolchain.
llvm-svn: 273016
Fix a regression which forbids using -std=cl|CL1.1|CL1.2|CL2.0 in driver.
Allow -std and -cl-std={cl|CL}{|1.1|1.2|2.0}.
Differential Revision: http://reviews.llvm.org/D20630
llvm-svn: 273015
Reapplying patch in r272777 which was reverted
because the llvm patch which added support
for generating the mcrr/mcrr2 instructions
from the intrinsic was causing an assertion
failure. This has now been fixed in llvm.
llvm-svn: 272983
This patch fixes a bug where we'd segfault (in some cases) if we saw a
variadic function with one or more pass_object_size arguments.
Differential Revision: http://reviews.llvm.org/D17462
llvm-svn: 272971
This is the second patch required to support compilation for Intel MCU target (e.g. Intel(R) Quark(TM) micro controller D 2000).
When IAMCU triple is used:
* Recognize and use IAMCU GCC toolchain
* Set up include paths
* Forbid C++
Differential Revision: http://reviews.llvm.org/D19274
llvm-svn: 272883
Parameters and non-static members of aggregates are still excluded,
and probably should remain that way.
Differential Revision: http://reviews.llvm.org/D19754
llvm-svn: 272859
When static variables are used in inline functions in header files anything that
uses that function ends up with a reference to the variable. Because
RecursiveASTVisitor uses the inline functions in LambdaCapture that use static
variables any AST plugin that uses RecursiveASTVisitor, such as the
PrintFunctionNames example, ends up with a reference to these variables. This is
bad on Windows when building with MSVC with LLVM_EXPORT_SYMBOLS_FOR_PLUGINS=ON
as variables used across a DLL boundary need to be explicitly dllimported in
the DLL using them.
This patch avoids that by adjusting LambdaCapture to be similar to before
r263921, with a capture of either 'this' or a VLA represented by a null Decl
pointer in DeclAndBits with an extra flag added to the bits to distinguish
between the two. This requires the use of an extra bit, and while Decl does
happen to be sufficiently aligned to allow this it's done in a way that means
PointerIntPair doesn't realise it and gives an assertion failure. Therefore I
also adjust Decl slightly to use LLVM_ALIGNAS to allow this.
Differential Revision: http://reviews.llvm.org/D20732
llvm-svn: 272788
Patch adds intrinsics for mrrc/mrrc2. The
intrinsics for mrrc/mrrc2 return a single
uint64_t to represent two 32 bit values.
The mcrr/mcrr2 intrinsic was changed to
accept a single uint64_t instead of two
32 bit values as the input for consistency.
Differential Revision: http://reviews.llvm.org/D21179
llvm-svn: 272777
classes.
MSVC actively uses unqualified lookup in dependent bases, lookup at the
instantiation point (non-dependent names may be resolved on things
declared later) etc. and all this stuff is the main cause of
incompatibility between clang and MSVC.
Clang tries to emulate MSVC behavior but it may fail in many cases.
clang could store lexed tokens for member functions definitions within
ClassTemplateDecl for later parsing during template instantiation.
It will allow resolving many possible issues with lookup in dependent
base classes and removing many already existing MSVC-specific
hacks/workarounds from the clang code.
llvm-svn: 272774
Summary:
This is similar to other loop pragmas like 'vectorize'. Currently it
only has state values: distribute(enable) and distribute(disable). When
one of these is specified the corresponding loop metadata is generated:
!{!"llvm.loop.distribute.enable", i1 true/false}
As a result, loop distribution will be attempted on the loop even if
Loop Distribution in not enabled globally. Analogously, with 'disable'
distribution can be turned off for an individual loop even when the pass
is otherwise enabled.
There are some slight differences compared to the existing loop pragmas.
1. There is no 'assume_safety' variant which makes its handling slightly
different from 'vectorize'/'interleave'.
2. Unlike the existing loop pragmas, it does not have a corresponding
numeric pragma like 'vectorize' -> 'vectorize_width'. So for the
consistency checks in CheckForIncompatibleAttributes we don't need to
check it against other pragmas. We just need to check for duplicates of
the same pragma.
Reviewers: rsmith, dexonsmith, aaron.ballman
Subscribers: bob.wilson, cfe-commits, hfinkel
Differential Revision: http://reviews.llvm.org/D19403
llvm-svn: 272656
Summary:
The validity of ABI/CPU pairs is no longer checked on the fly but is
instead checked after initialization. As a result, invalid CPU/ABI pairs
can be reported as being known but invalid instead of being unknown. For
example, we now emit:
error: ABI 'n32' is not supported on CPU 'mips32r2'
instead of:
error: unknown target ABI 'n64'
Reviewers: atanasyan
Subscribers: sdardis, cfe-commits
Differential Revision: http://reviews.llvm.org/D21023
llvm-svn: 272645
If definition of default function argument uses itself, clang crashed,
because corresponding function parameter is not associated with the default
argument yet. With this fix clang emits appropriate error message.
This change fixes PR28105.
Differential Revision: http://reviews.llvm.org/D21301
llvm-svn: 272623
Summary:
This patch introduces the concept of offloading tool chain and offloading kind. Each tool chain may have associated an offloading kind that marks it as used in a given programming model that requires offloading.
It also adds the logic to iterate on the tool chains based on the kind. Currently, only CUDA is supported, but in general a programming model (an offloading kind) may have associated multiple tool chains that require supporting offloading.
This patch does not add tests - its goal is to keep the existing functionality.
This patch is the first of a series of three that attempts to make the current support of CUDA more generic and easier to extend to other programming models, namely OpenMP. It tries to capture the suggestions/improvements/concerns on the initial proposal in http://lists.llvm.org/pipermail/cfe-dev/2016-February/047547.html. It only tackles the more consensual part of the proposal, i.e.does not address the problem of intermediate files bundling yet.
Reviewers: ABataev, jlebar, echristo, hfinkel, tra
Subscribers: guansong, Hahnfeld, andreybokhanko, tcramer, mkuron, cfe-commits, arpith-jacob, carlo.bertolli, caomhin
Differential Revision: http://reviews.llvm.org/D18170
llvm-svn: 272571
Differential Revision: http://reviews.llvm.org/D19843
Corresponding LLVM change: http://reviews.llvm.org/D19842
Re-commit after addressing issues with of generating too many warnings for Windows and asan test failures.
Patch by Eric Niebler
llvm-svn: 272562
This patch implements PR#22821.
Taking the address of a packed member is dangerous since the reduced
alignment of the pointee is lost. This can lead to memory alignment
faults in some architectures if the pointer value is dereferenced.
This change adds a new warning to clang emitted when taking the address
of a packed member. A packed member is either a field/data member
declared as attribute((packed)) or belonging to a struct/class
declared as such. The associated flag is -Waddress-of-packed-member
Differential Revision: http://reviews.llvm.org/D20561
llvm-svn: 272552
We can now use __builtin_nontemporal_store instead of target specific builtins for naturally aligned nontemporal stores which avoids the need for handling in CGBuiltin.cpp
The scalar integer nontemporal (unaligned) store builtins will have to wait as __builtin_nontemporal_store currently assumes natural alignment and doesn't accept the 'packed struct' trick that we use for normal unaligned load/stores.
The nontemporal loads require further backend support before we can safely convert them to __builtin_nontemporal_load
Differential Revision: http://reviews.llvm.org/D21272
llvm-svn: 272540
This commit adds a static analysis checker to verify the correct usage of the MPI API in C
and C++. This version updates the reverted r271981 to fix a memory corruption found by the
ASan bots.
Three path-sensitive checks are included:
- Double nonblocking: Double request usage by nonblocking calls without intermediate wait
- Missing wait: Nonblocking call without matching wait.
- Unmatched wait: Waiting for a request that was never used by a nonblocking call
Examples of how to use the checker can be found at https://github.com/0ax1/MPI-Checker
A patch by Alexander Droste!
Reviewers: zaks.anna, dcoughlin
Differential Revision: http://reviews.llvm.org/D21081
llvm-svn: 272529
Does a good job with type and non-type template arguments
and lays the groundwork for template template arguments to
visualize well once there is a TemplateName visualizer.
Also fixed what looks like an incorrect comment in the
header for ParsedTemplate.h.
llvm-svn: 272521
The bug report by Gonzalo (https://llvm.org/bugs/show_bug.cgi?id=27507 -- which results in clang crashing when generic lambdas that capture 'this' are instantiated in contexts where the Functionscopeinfo stack is not in a reliable state - yet getCurrentThisType expects it to be) - unearthed some additional bugs in regards to maintaining proper cv qualification through 'this' when performing by value captures of '*this'.
This patch attempts to correct those bugs and makes the following changes:
o) when capturing 'this', we do not need to remember the type of 'this' within the LambdaScopeInfo's Capture - it is never really used for a this capture - so remove it.
o) teach getCurrentThisType to walk the stack of lambdas (even in scenarios where we run out of LambdaScopeInfo's such as when instantiating call operators) looking for by copy captures of '*this' and resetting the type of 'this' based on the constness of that capturing lambda's call operator.
This patch has been baking in review-hell for > 6 weeks - all the comments so far have been addressed and the bug (that it addresses in passing, and I regret not submitting as a separate patch initially) has been reported twice independently, so is frequent and important for us not to just sit on. I merged the cv qualification-fix and the PR-fix initially in one patch, since they resulted from my initial implementation of star-this and so were related. If someone really feels strongly, I can put in the time to revert this - separate the two out - and recommit. I won't claim it's immunized against all bugs, but I feel confident enough about the fix to land it for now.
llvm-svn: 272480
GCC still permits enabling the SjLj EH model. This is something which can be
done on various targets. Hoist the -fsjlj-exceptions option into the driver and
pass it through. This allows one to opt into the alternative EH model while
retaining the default to be the target's default.
Resolves PR27749!
llvm-svn: 272424
Microsoft headers, comdef.h and comutil.h, assume that this is an OK
thing to do. Downgrade the hard error to a warning if we are in
-fms-extensions mode.
This fixes PR28080.
llvm-svn: 272412
Rehashing the ExplodedNode table is very expensive. The hashing
itself is expensive, and the general activity of iterating over the
hash table is highly cache unfriendly. Instead, we guess at the
eventual size by using the maximum number of steps allowed. This
generally avoids a rehash. It is possible that we still need to
rehash if the backlog of work that is added to the worklist
significantly exceeds the number of work items that we process. Even
if we do need to rehash in that scenario, this change is still a
win, as we still have fewer rehashes that we would have prior to
this change.
For small work loads, this will increase the memory used. For large
work loads, it will somewhat reduce the memory used. Speed is
significantly increased. A large .C file took 3m53.812s to analyze
prior to this change. Now it takes 3m38.976s, for a ~6% improvement.
http://reviews.llvm.org/D20933
llvm-svn: 272394
Summary:
Create a new Frontend LangOpt to specify the renderscript language. It
is enabled by the "-x renderscript" option from the driver.
Add a "kernel" function attribute only for RenderScript (an "ignored
attribute" warning is generated otherwise).
Make the NativeHalfType and NativeHalfArgsAndReturns LangOpts be implied
by the RenderScript LangOpt.
Reviewers: rsmith
Subscribers: cfe-commits, srhines
Differential Revision: http://reviews.llvm.org/D21198
llvm-svn: 272342
Summary:
Add RenderScript language type and associate it with ".rs" extensions.
Test that the driver passes "-x renderscript" to the frontend for ".rs"
files.
(Also add '.rs' to the list of suffixes tested by lit).
Reviewers: rsmith
Subscribers: cfe-commits, srhines
Differential Revision: http://reviews.llvm.org/D21199
llvm-svn: 272317
Summary: Clang changes to make use of the LLVM intrinsics added in D21160.
Reviewers: tra
Subscribers: jholewinski, cfe-commits
Differential Revision: http://reviews.llvm.org/D21162
llvm-svn: 272299
These ExprWithCleanups are added for holding a RunCleanupsScope not
for destructor calls; rather, they are for lifetime marks. This requires
ExprWithCleanups to keep a bit to indicate whether it have cleanups with
side effects (e.g. dtor calls).
Differential Revision: http://reviews.llvm.org/D20498
llvm-svn: 272296
Given the following C++:
```
void foo();
void foo() __attribute__((enable_if(false, "")));
bool bar() {
auto P = foo;
return P == foo;
}
```
We'll currently happily (and correctly) resolve `foo` to the `foo`
overload without `enable_if` when assigning to `P`. However, we'll
complain about an ambiguous overload on the `P == foo` line, because
`Sema::CheckPlaceholderExpr` doesn't recognize that there's only one
`foo` that could possibly work here.
This patch teaches `Sema::CheckPlaceholderExpr` how to properly deal
with such cases.
Grepping for other callers of things like
`Sema::ResolveAndFixSingleFunctionTemplateSpecialization`, it *looks*
like this is the last place that needed to be fixed up. If I'm wrong,
I'll see if there's something we can do that beats what amounts to
whack-a-mole with bugs.
llvm-svn: 272080
Second try at reapplying
"[analyzer] Add checker for correct usage of MPI API in C and C++."
Special thanks to Dan Liew for helping test the fix for the template
specialization compiler error with gcc.
The original patch is by Alexander Droste!
Differential Revision: http://reviews.llvm.org/D12761
llvm-svn: 271977
Reapply r271907 with a fix for the compiler error with gcc about specializing
clang::ento::ProgramStateTrait in a different namespace.
Differential Revision: http://reviews.llvm.org/D12761
llvm-svn: 271914
This commit adds a static analysis checker to check for the correct usage of the
MPI API in C and C++.
3 path-sensitive checks are included:
- Double nonblocking: Double request usage by nonblocking calls
without intermediate wait.
- Missing wait: Nonblocking call without matching wait.
- Unmatched wait: Waiting for a request that was never used by a
nonblocking call.
Examples of how to use the checker can be found
at https://github.com/0ax1/MPI-Checker
Reviewers: zaks.anna
A patch by Alexander Droste!
Differential Revision: http://reviews.llvm.org/D12761
llvm-svn: 271907
We now have a cmake option to change the default: ENABLE_LINKER_BUILD_ID.
The reason is that build-id is fairly expensive, so we shouldn't impose
it in the regular edit/build cycle.
This is similar to gcc, that has an off by default --enable-linker-build-id
option.
llvm-svn: 271692
The 'cvtt' truncation (round to zero) conversions can be safely represented as generic __builtin_convertvector (fptosi) calls instead of x86 intrinsics. We already do this (implicitly) for the scalar equivalents.
Note: I looked at updating _mm_cvttpd_epi32 as well but this still requires a lot more backend work to correctly lower (both for debug and optimized builds).
Differential Revision: http://reviews.llvm.org/D20859
llvm-svn: 271436
Summary:
When a replacement's offset is set to UINT_MAX or -1U, it is treated as
a header insertion replacement by cleanupAroundReplacements(). The new #include
directive is then inserted into the correct block.
Reviewers: klimek, djasper
Subscribers: klimek, cfe-commits, bkramer
Differential Revision: http://reviews.llvm.org/D20734
llvm-svn: 271276
The VPMOVSX and (V)PMOVZX sign/zero extension intrinsics can be safely represented as generic __builtin_convertvector calls instead of x86 intrinsics.
This patch removes the clang builtins and their use in the sse2/avx headers - a companion patch will remove/auto-upgrade the llvm intrinsics.
Note: We already did this for SSE41 PMOVSX sometime ago.
Differential Revision: http://reviews.llvm.org/D20684
llvm-svn: 271106
Diagnostics that happen during driver time do not have color output support
unless -fcolor-diagonostic is explicitly passed into the driver. This is not a
problem for cc1 since dianostic arguments are properly handled and color is
enabled by default if the terminal supports it.
Make the driver behave like CC1. There are tests that already check for these
flags, but for the color itself there's no sensible way to test it.
Differential Revision: http://reviews.llvm.org/D20404
rdar://problem/26290980
llvm-svn: 271042
This patch adds the commandline option -mcompact-branches={never,optimal,always),
which controls how LLVM generates compact branches for MIPSR6 targets. By default,
the compact branch policy is 'optimal' where LLVM will generate the most
appropriate branch for any situation. The 'never' and 'always' policy will disable
or always generate compact branches wherever possible respectfully.
Reviewers: dsanders, vkalintiris, atanasyan
Differential Revision: http://reviews.llvm.org/D20729
llvm-svn: 271000
Summary:
The patch contains the parsing and sema support for the `from` clause.
Patch based on the original post by Kelvin Li.
Reviewers: hfinkel, carlo.bertolli, kkwli0, arpith-jacob, ABataev
Subscribers: caomhin, cfe-commits
Differential Revision: http://reviews.llvm.org/D18488
llvm-svn: 270882
Summary:
The patch contains the parsing and sema support for the `to` clause.
Patch based on the original post by Kelvin Li.
Reviewers: carlo.bertolli, hfinkel, kkwli0, arpith-jacob, ABataev
Subscribers: caomhin, cfe-commits
Differential Revision: http://reviews.llvm.org/D18597
llvm-svn: 270880
Summary:
This patch is to add parsing and sema support for `target update` directive. Support for the `to` and `from` clauses will be added by a different patch. This patch also adds support for other clauses that are already implemented upstream and apply to `target update`, e.g. `device` and `if`.
This patch is based on the original post by Kelvin Li.
Reviewers: hfinkel, carlo.bertolli, kkwli0, arpith-jacob, ABataev
Subscribers: caomhin, cfe-commits
Differential Revision: http://reviews.llvm.org/D15944
llvm-svn: 270878
OpenMP version.
If '-fopenmp' option is provided '-fopenmp-version=' allows to control,
which version of OpenMP must be supported. Currently it affects only the
value of _OPENMP define.
llvm-svn: 270838
This implements support for MS-specific __unaligned qualifier in functions and
makes the following test case both compile and mangle correctly:
struct S {
void f() __unaligned;
};
void S::f() __unaligned {
}
Differential Revision: http://reviews.llvm.org/D20437
llvm-svn: 270834
objective-c properties.
This fixes an assert in CodeGen that fires when the getter and setter
functions for an objective-c property of type _Atomic(_Bool) are
synthesized.
rdar://problem/26322972
Differential Revision: http://reviews.llvm.org/D20407
llvm-svn: 270808
If the callee has a valid location (not all do), then use that. Otherwise, fall
back to the starting location. This makes sure that the debug info for calls
points to the call (not the start of the expression providing the object on
which the member function is being called).
For example, given this:
f->foo()->bar();
we don't want both calls to point to the 'f', but rather to the 'foo()' and
the 'bar()'.
Fixes PR27567.
Differential Revision: http://reviews.llvm.org/D19708
llvm-svn: 270775
If we have some function with dllimport attribute and then we have the function
definition in the same module but without dllimport attribute we should add
dllexport attribute to this function definition.
The same should be done for variables.
Example:
struct __declspec(dllimport) C3 {
~C3();
};
C3::~C3() {;} // we should export this definition.
Patch by Andrew V. Tischenko
Differential revision: http://reviews.llvm.org/D18953
llvm-svn: 270686
Summary:
Adds a new -fsanitize=efficiency-working-set flag to enable esan's working
set tool. Adds appropriate tests for the new flag.
Reviewers: aizatsky
Subscribers: vitalybuka, zhaoqin, kcc, eugenis, llvm-commits
Differential Revision: http://reviews.llvm.org/D20484
llvm-svn: 270641
Summary:
In dependent contexts where we know a type name is required, such as a
new expression, we can recover by forming a DependentNameType.
This generalizes our existing compatibility hack for default arguments
for template type parameters.
Works towards parsing atlctrlw.h, which is PR26748.
Reviewers: avt77, rsmith
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D20500
llvm-svn: 270615
-finline-functions and /Ob2 are currently ignored by Clang. The only way to
enable inlining is to use the global O flags, which also enable other options,
or to emit LLVM bitcode using Clang, then running opt by hand with the inline
pass.
This patch allows to simply use the -finline-functions flag (same as GCC) or
/Ob2 in clang-cl mode to enable inlining without other optimizations.
This is the first patch of a serie to improve support for the /Ob flags.
Patch by Rudy Pons <rudy.pons@ilod.org>!
Differential Revision: http://reviews.llvm.org/D20576
llvm-svn: 270609
The statement constructed an anonymous structure which was typedefed. The
anonymous structure has internal linkage, and that would cause an error when
building with modules. Give the type declaration a tag name to address the
error when building with modules.
llvm-svn: 270520
Both the (V)CVTDQ2PD(Y) (i32 to f64) and (V)CVTPS2PD(Y) (f32 to f64) conversion instructions are lossless and can be safely represented as generic __builtin_convertvector calls instead of x86 intrinsics without affecting final codegen.
This patch removes the clang builtins and their use in the sse2/avx headers - a future patch will deal with removing the llvm intrinsics, but that will require a bit more work.
Differential Revision: http://reviews.llvm.org/D20528
llvm-svn: 270499
Add two tests which show our error handling behavior for invalid
parameters in the layout_version and empty_bases attributes.
Amend our documentation to make it more clear that
__declspec(empty_bases) and __declspec(layout_version) can only apply to
classes, structs, and unions.
llvm-svn: 270461
MSVC now supports the __is_assignable type trait intrinsic,
to enable easier and more efficient implementation of the
Standard Library's is_assignable trait.
As of Visual Studio 2015 Update 3, the VC Standard Library
implementation uses the new intrinsic unconditionally.
The implementation is pretty straightforward due to the previously
existing is_nothrow_assignable and is_trivially_assignable.
We handle __is_assignable via the same code as the other two except
that we skip the extra checks for nothrow or triviality.
Patch by Dave Bartolomeo!
Differential Revision: http://reviews.llvm.org/D20492
llvm-svn: 270458
The layout_version attribute is pretty straightforward: use the layout
rules from version XYZ of MSVC when used like
struct __declspec(layout_version(XYZ)) S {};
The empty_bases attribute is more interesting. It tries to get the C++
empty base optimization to fire more often by tweaking the MSVC ABI
rules in subtle ways:
1. Disable the leading and trailing zero-sized object flags if a class
is marked __declspec(empty_bases) and is empty.
This means that given:
struct __declspec(empty_bases) A {};
struct __declspec(empty_bases) B {};
struct C : A, B {};
'C' will have size 1 and nvsize 0 despite not being annotated
__declspec(empty_bases).
2. When laying out virtual or non-virtual bases, disable the injection
of padding between classes if the most derived class is marked
__declspec(empty_bases).
This means that given:
struct A {};
struct B {};
struct __declspec(empty_bases) C : A, B {};
'C' will have size 1 and nvsize 0.
3. When calculating the offset of a non-virtual base, choose offset zero
if the most derived class is marked __declspec(empty_bases) and the
base is empty _and_ has an nvsize of 0.
Because of the ABI rules, this does not mean that empty bases
reliably get placed at offset 0!
For example:
struct A {};
struct B {};
struct __declspec(empty_bases) C : A, B { virtual ~C(); };
'C' will be pointer sized to account for the vfptr at offset 0.
'A' and 'B' will _not_ be at offset 0 despite being empty!
Instead, they will be located right after the vfptr.
This occurs due to the interaction betweeen non-virtual base layout
and virtual function pointer injection: injection occurs after the
nv-bases and shifts them down by the size of a pointer.
llvm-svn: 270457
OpenCL builtin functions to_{global|local|private} accepts argument of pointer type to arbitrary pointee type, and return a pointer to the same pointee type in different addr space, i.e.
global gentype *to_global(gentype *p);
It is not desirable to declare it as
global void *to_global(void *);
in opencl header file since it misses diagnostics.
This patch implements these builtin functions as Clang builtin functions. In the builtin def file they are defined to have signature void*(void*). When handling call expressions, their declarations are re-written to have correct parameter type and return type corresponding to the call argument.
In codegen call to addr void *to_addr(void*) is generated with addrcasts or bitcasts to facilitate implementation in builtin library.
Differential Revision: http://reviews.llvm.org/D19932
llvm-svn: 270261
Summary:
This change automatically sorts ES6 imports and exports into four groups:
absolute imports, parent imports, relative imports, and then exports. Exports
are sorted in the same order, but not grouped further.
To keep JS import sorting out of Format.cpp, this required extracting the
TokenAnalyzer infrastructure to separate header and implementation files.
Reviewers: djasper
Subscribers: cfe-commits, klimek
Differential Revision: http://reviews.llvm.org/D20198
llvm-svn: 270203
Summary:
Previously it was implemented as inline asm in the CUDA headers.
This change allows us to use the [addr+imm] addressing mode when
executing ld.global.nc instructions. This translates into a 1.3x
speedup on some benchmarks that call this instruction from within an
unrolled loop.
Reviewers: tra, rsmith
Subscribers: jhen, cfe-commits, jholewinski
Differential Revision: http://reviews.llvm.org/D19990
llvm-svn: 270150
According to Cuda Programming guide (v7.5, E2.3.1):
> __device__, __constant__ and __shared__ variables defined in namespace
> scope, that are of class type, cannot have a non-empty constructor or a
> non-empty destructor.
Clang already deals with device-side constructors (see D15305).
This patch enforces similar rules for destructors.
Differential Revision: http://reviews.llvm.org/D20140
llvm-svn: 270108
All additional include directories are relative to the toolchain install
folder. So let's do not pass this folder to each callback to simplify
and slightly reduce the code.
llvm-svn: 270069
- Fixed cdp intrinsic to only accept compile time
constant values previously you could pass in a
variable to the builtin which would result in
illegal llvm assembly output
Differential Revision: http://reviews.llvm.org/D20394
llvm-svn: 270058
an identifier table lookup, *and* copy the LangOptions (including various
std::vector<std::string>s). Twice. We call this function once each time we start
parsing a declaration specifier sequence, and once for each call to Sema::Diag.
This reduces the compile time for a sample .c file from the linux kernel by 20%.
llvm-svn: 270009