Summary:
Following patch D19265 which enable software floating point support in the Sparc backend, this patch enables the option to be enabled in the front-end using the -msoft-float option.
The user should ensure a library (such as the builtins from Compiler-RT) that includes the software floating point routines is provided.
Reviewers: jyknight, lero_chris
Subscribers: jyknight, cfe-commits
Differential Revision: http://reviews.llvm.org/D20419
llvm-svn: 270538
directives.
If firstprivate variable is is captured by value in outlined region and then used as firstprivate variable in inner worksharing directive, the copy for this firstprivate variable was not created. Fixed this bug.
llvm-svn: 270536
Clang doesn't dllexport defaulted special member function defaulted
inside class but does it if they defaulted outside class. MSVC doesn't
make any distinction where they were defaulted. Also MSVC 2013 and 2015
export different set of members. MSVC2015 doesn't emit trivial defaulted
x-tors but does emit copy assign operator.
Differential revision: http://reviews.llvm.org/D20422
llvm-svn: 270535
Thread through -fsjlj-exceptions to the backend via the TargetOptions. This is
in preparation for supporting SjLj exceptions on x86 (e.g. for MinGW).
llvm-svn: 270528
The statement constructed an anonymous structure which was typedefed. The
anonymous structure has internal linkage, and that would cause an error when
building with modules. Give the type declaration a tag name to address the
error when building with modules.
llvm-svn: 270520
Both the (V)CVTDQ2PD(Y) (i32 to f64) and (V)CVTPS2PD(Y) (f32 to f64) conversion instructions are lossless and can be safely represented as generic __builtin_convertvector calls instead of x86 intrinsics without affecting final codegen.
This patch removes the clang builtins and their use in the sse2/avx headers - a future patch will deal with removing the llvm intrinsics, but that will require a bit more work.
Differential Revision: http://reviews.llvm.org/D20528
llvm-svn: 270499
Add two tests which show our error handling behavior for invalid
parameters in the layout_version and empty_bases attributes.
Amend our documentation to make it more clear that
__declspec(empty_bases) and __declspec(layout_version) can only apply to
classes, structs, and unions.
llvm-svn: 270461
MSVC now supports the __is_assignable type trait intrinsic,
to enable easier and more efficient implementation of the
Standard Library's is_assignable trait.
As of Visual Studio 2015 Update 3, the VC Standard Library
implementation uses the new intrinsic unconditionally.
The implementation is pretty straightforward due to the previously
existing is_nothrow_assignable and is_trivially_assignable.
We handle __is_assignable via the same code as the other two except
that we skip the extra checks for nothrow or triviality.
Patch by Dave Bartolomeo!
Differential Revision: http://reviews.llvm.org/D20492
llvm-svn: 270458
The layout_version attribute is pretty straightforward: use the layout
rules from version XYZ of MSVC when used like
struct __declspec(layout_version(XYZ)) S {};
The empty_bases attribute is more interesting. It tries to get the C++
empty base optimization to fire more often by tweaking the MSVC ABI
rules in subtle ways:
1. Disable the leading and trailing zero-sized object flags if a class
is marked __declspec(empty_bases) and is empty.
This means that given:
struct __declspec(empty_bases) A {};
struct __declspec(empty_bases) B {};
struct C : A, B {};
'C' will have size 1 and nvsize 0 despite not being annotated
__declspec(empty_bases).
2. When laying out virtual or non-virtual bases, disable the injection
of padding between classes if the most derived class is marked
__declspec(empty_bases).
This means that given:
struct A {};
struct B {};
struct __declspec(empty_bases) C : A, B {};
'C' will have size 1 and nvsize 0.
3. When calculating the offset of a non-virtual base, choose offset zero
if the most derived class is marked __declspec(empty_bases) and the
base is empty _and_ has an nvsize of 0.
Because of the ABI rules, this does not mean that empty bases
reliably get placed at offset 0!
For example:
struct A {};
struct B {};
struct __declspec(empty_bases) C : A, B { virtual ~C(); };
'C' will be pointer sized to account for the vfptr at offset 0.
'A' and 'B' will _not_ be at offset 0 despite being empty!
Instead, they will be located right after the vfptr.
This occurs due to the interaction betweeen non-virtual base layout
and virtual function pointer injection: injection occurs after the
nv-bases and shifts them down by the size of a pointer.
llvm-svn: 270457
Exherbo has an alternative file system layout to accommodate multiarch. The
loader is located at /usr/${triple}/lib/${loader}. Adjust the Linux toolchain
to support that on exherbo.
llvm-svn: 270392
The parameter already requires the toolchain, sink the method into the class.
This also enables the use of the distro detection logic which will be needed to
support Exherbo's multiarch approach.
llvm-svn: 270352
Convert the cascading if/else to a switch. This makes it easier to follow the
logic. Minor difference is that we no longer default to x86_64 but rather to
the architecture specified by the architecture.
llvm-svn: 270351
Ensure _mm256_extract_epi8 and _mm256_extract_epi16 zero extend their i8/i16 result to i32. This matches _mm_extract_epi8 and _mm_extract_epi16.
Fix for PR27594
Differential Revision: http://reviews.llvm.org/D20468
llvm-svn: 270330
OpenCL builtin functions to_{global|local|private} accepts argument of pointer type to arbitrary pointee type, and return a pointer to the same pointee type in different addr space, i.e.
global gentype *to_global(gentype *p);
It is not desirable to declare it as
global void *to_global(void *);
in opencl header file since it misses diagnostics.
This patch implements these builtin functions as Clang builtin functions. In the builtin def file they are defined to have signature void*(void*). When handling call expressions, their declarations are re-written to have correct parameter type and return type corresponding to the call argument.
In codegen call to addr void *to_addr(void*) is generated with addrcasts or bitcasts to facilitate implementation in builtin library.
Differential Revision: http://reviews.llvm.org/D19932
llvm-svn: 270261
The `FreeBSDTargetInfo` class has always set the `__FreeBSD_cc_version`
predefined macro to a rather static value, calculated from the major OS
version.
In the FreeBSD base system, we will start incrementing the value of this
macro whenever we make any signifant change to clang, so we need a way
to configure the macro's value at build time.
Use `FREEBSD_CC_VERSION` for this, which we can define in the FreeBSD
build system using either the `-D` command line option, or an include
file. Stock builds will keep the earlier value.
Differential Revision: http://reviews.llvm.org/D20037
llvm-svn: 270240
This is matching what trunk gcc is accepting. Also adds a missing ssse3
case. PR27779. The amount of duplication here is annoying, maybe it
should be factored into a separate .def file?
llvm-svn: 270224
Summary:
This change automatically sorts ES6 imports and exports into four groups:
absolute imports, parent imports, relative imports, and then exports. Exports
are sorted in the same order, but not grouped further.
To keep JS import sorting out of Format.cpp, this required extracting the
TokenAnalyzer infrastructure to separate header and implementation files.
Reviewers: djasper
Subscribers: cfe-commits, klimek
Differential Revision: http://reviews.llvm.org/D20198
llvm-svn: 270203
Some people have weird CI systems that run each test subdirectory
independently without access to other parallel trees.
Unfortunately, this means we have to suffer some duplication until Art
can sort out how to share these types.
llvm-svn: 270164
The lexer sets the end location of macro arguments incorrectly *if*,
while merging consecutive args to fit into a single SLocEntry, it finds
args which come from different macro files.
Fix the issue by using separate SLocEntries in this situation.
This fixes a code coverage crasher (rdar://problem/26181005). Because
the lexer reported end locations for certain macro args incorrectly, we
would generate bogus coverage mappings with negative line offsets.
Reviewed-by: akyrtzi
Differential Revision: http://reviews.llvm.org/D20401
llvm-svn: 270160
The function strcmp() can return any value, not just {-1,0,1} : "The strcmp(const char *s1, const char *s2) function returns an integer greater than, equal to, or less than zero, accordingly as the string pointed to by s1 is greater than, equal to, or less than the string pointed to by s2." [C11 7.24.4.2p3]
https://llvm.org/bugs/show_bug.cgi?id=23790http://reviews.llvm.org/D16317
llvm-svn: 270154
Summary:
Previously it was implemented as inline asm in the CUDA headers.
This change allows us to use the [addr+imm] addressing mode when
executing ld.global.nc instructions. This translates into a 1.3x
speedup on some benchmarks that call this instruction from within an
unrolled loop.
Reviewers: tra, rsmith
Subscribers: jhen, cfe-commits, jholewinski
Differential Revision: http://reviews.llvm.org/D19990
llvm-svn: 270150
According to Cuda Programming guide (v7.5, E2.3.1):
> __device__, __constant__ and __shared__ variables defined in namespace
> scope, that are of class type, cannot have a non-empty constructor or a
> non-empty destructor.
Clang already deals with device-side constructors (see D15305).
This patch enforces similar rules for destructors.
Differential Revision: http://reviews.llvm.org/D20140
llvm-svn: 270108
Codegen tests for device-side variable initialization are subset of test
cases used to verify Sema's part of the job.
Including CodeGenCUDA/device-var-init.cu from SemaCUDA makes it easier to
keep both sides in sync.
Differential Revision: http://reviews.llvm.org/D20139
llvm-svn: 270107
This matches default nvcc behavior and gives substantial
performance boost on GPU where fmad is much cheaper compared to add+mul.
Differential Revision: http://reviews.llvm.org/D20341
llvm-svn: 270094
All additional include directories are relative to the toolchain install
folder. So let's do not pass this folder to each callback to simplify
and slightly reduce the code.
llvm-svn: 270069
CodeSourcery toolchain is a standalone toolchain which always uses
the same triple name in its paths. It is independent from target
triple used by the driver.
llvm-svn: 270067
- Fixed cdp intrinsic to only accept compile time
constant values previously you could pass in a
variable to the builtin which would result in
illegal llvm assembly output
Differential Revision: http://reviews.llvm.org/D20394
llvm-svn: 270058
This reversal is being done with r267453's author's (i.e. Richard Smith's) permission.
This fixes https://llvm.org/bugs/show_bug.cgi?id=27601
Also, per Richard's request the examples from the bug report have been added to our test suite.
llvm-svn: 270016
an identifier table lookup, *and* copy the LangOptions (including various
std::vector<std::string>s). Twice. We call this function once each time we start
parsing a declaration specifier sequence, and once for each call to Sema::Diag.
This reduces the compile time for a sample .c file from the linux kernel by 20%.
llvm-svn: 270009
instance method.
When diagnosing unimplemented class property, make sure we emit
a warning when we only see an instance method with the right selector.
Also warn when we only see a class method for an instance property.
rdar://26141719
llvm-svn: 269968
Summary:
-fembed-bitcode was only checking for old style LTO flag (-flto) but not
considering the new -flto= style option. That makes clang output bitcode
embedded in bitcode object when using -flto= and -fembed-bitcode= together.
Now clang should output normal bitcode file when using LTO and ignores
-fembed-bitcode option.
Reviewers: joker.eph
Subscribers: joker.eph, cfe-commits
Differential Revision: http://reviews.llvm.org/D20374
llvm-svn: 269961
Summary:
MONITORX/MWAITX instructions provide similar capability to the MONITOR/MWAIT
pair while adding a timer function, such that another termination of the MWAITX
instruction occurs when the timer expires. The presence of the MONITORX and
MWAITX instructions is indicated by CPUID 8000_0001, ECX, bit 29.
The MONITORX and MWAITX instructions are intercepted by the same bits that
intercept MONITOR and MWAIT. MONITORX instruction establishes a range to be
monitored. MWAITX instruction causes the processor to stop instruction
execution and enter an implementation-dependent optimized state until
occurrence of a class of events.
Opcode of MONITORX instruction is "0F 01 FA". Opcode of MWAITX instruction is
"0F 01 FB". These opcode information is used in adding tests for the
disassembler.
These instructions are enabled for AMD's bdver4 architecture.
Patch by Ganesh Gopalasubramanian!
Reviewers: echristo, craig.topper
Subscribers: RKSimon, joker.eph, llvm-commits, cfe-commits
Differential Revision: http://reviews.llvm.org/D19796
llvm-svn: 269907