Previously, clang/llvm treated inline-asm instructions conservatively,
choosing not to eliminate the instructions or hoisting them out of a loop
even when it was safe to do so. This commit makes changes to attach a
readonly or readnone attribute to an inline-asm instruction, which enables
passes such as LICM and EarlyCSE to move or optimize away the instruction.
rdar://problem/11358192
Differential Revision: http://reviews.llvm.org/D10546
llvm-svn: 241930
tools/clang/test/CodeGen/packed-nest-unpacked.c contains this test:
struct XBitfield {
unsigned b1 : 10;
unsigned b2 : 12;
unsigned b3 : 10;
};
struct YBitfield {
char x;
struct XBitfield y;
} __attribute((packed));
struct YBitfield gbitfield;
unsigned test7() {
// CHECK: @test7
// CHECK: load i32, i32* getelementptr inbounds (%struct.YBitfield, %struct.YBitfield* @gbitfield, i32 0, i32 1, i32 0), align 4
return gbitfield.y.b2;
}
The "align 4" is actually wrong. Accessing all of "gbitfield.y" as a single
i32 is of course possible, but that still doesn't make it 4-byte aligned as
it remains packed at offset 1 in the surrounding gbitfield object.
This alignment was changed by commit r169489, which also introduced changes
to bitfield access code in CGExpr.cpp. Code before that change used to take
into account *both* the alignment of the field to be accessed within the
current struct, *and* the alignment of that outer struct itself; this logic
was removed by the above commit.
Neglecting to consider both values can cause incorrect code to be generated
(I've seen an unaligned access crash on SystemZ due to this bug).
In order to always use the best known alignment value, this patch removes
the CGBitFieldInfo::StorageAlignment member and replaces it with a
StorageOffset member specifying the offset from the start of the surrounding
struct to the bitfield's underlying storage. This offset can then be combined
with the best-known alignment for a bitfield access lvalue to determine the
alignment to use when accessing the bitfield's storage.
Differential Revision: http://reviews.llvm.org/D11034
llvm-svn: 241916
This patch corresponds to review:
http://reviews.llvm.org/D10972
Fix for the handling of dependent features that are enabled by default
on some CPU's (such as -mvsx, -mpower8-vector).
Also provides a number of new interfaces or fixes existing ones in
altivec.h.
Changed signatures to conform to ABI:
vector short vec_perm(vector signed short, vector signed short, vector unsigned char)
vector int vec_perm(vector signed int, vector signed int, vector unsigned char)
vector long long vec_perm(vector signed long long, vector signed long long, vector unsigned char)
vector signed char vec_sld(vector signed char, vector signed char, const int)
vector unsigned char vec_sld(vector unsigned char, vector unsigned char, const int)
vector bool char vec_sld(vector bool char, vector bool char, const int)
vector unsigned short vec_sld(vector unsigned short, vector unsigned short, const int)
vector signed short vec_sld(vector signed short, vector signed short, const int)
vector signed int vec_sld(vector signed int, vector signed int, const int)
vector unsigned int vec_sld(vector unsigned int, vector unsigned int, const int)
vector float vec_sld(vector float, vector float, const int)
vector signed char vec_splat(vector signed char, const int)
vector unsigned char vec_splat(vector unsigned char, const int)
vector bool char vec_splat(vector bool char, const int)
vector signed short vec_splat(vector signed short, const int)
vector unsigned short vec_splat(vector unsigned short, const int)
vector bool short vec_splat(vector bool short, const int)
vector pixel vec_splat(vector pixel, const int)
vector signed int vec_splat(vector signed int, const int)
vector unsigned int vec_splat(vector unsigned int, const int)
vector bool int vec_splat(vector bool int, const int)
vector float vec_splat(vector float, const int)
Added a VSX path to:
vector float vec_round(vector float)
Added interfaces:
vector signed char vec_eqv(vector signed char, vector signed char)
vector signed char vec_eqv(vector bool char, vector signed char)
vector signed char vec_eqv(vector signed char, vector bool char)
vector unsigned char vec_eqv(vector unsigned char, vector unsigned char)
vector unsigned char vec_eqv(vector bool char, vector unsigned char)
vector unsigned char vec_eqv(vector unsigned char, vector bool char)
vector signed short vec_eqv(vector signed short, vector signed short)
vector signed short vec_eqv(vector bool short, vector signed short)
vector signed short vec_eqv(vector signed short, vector bool short)
vector unsigned short vec_eqv(vector unsigned short, vector unsigned short)
vector unsigned short vec_eqv(vector bool short, vector unsigned short)
vector unsigned short vec_eqv(vector unsigned short, vector bool short)
vector signed int vec_eqv(vector signed int, vector signed int)
vector signed int vec_eqv(vector bool int, vector signed int)
vector signed int vec_eqv(vector signed int, vector bool int)
vector unsigned int vec_eqv(vector unsigned int, vector unsigned int)
vector unsigned int vec_eqv(vector bool int, vector unsigned int)
vector unsigned int vec_eqv(vector unsigned int, vector bool int)
vector signed long long vec_eqv(vector signed long long, vector signed long long)
vector signed long long vec_eqv(vector bool long long, vector signed long long)
vector signed long long vec_eqv(vector signed long long, vector bool long long)
vector unsigned long long vec_eqv(vector unsigned long long, vector unsigned long long)
vector unsigned long long vec_eqv(vector bool long long, vector unsigned long long)
vector unsigned long long vec_eqv(vector unsigned long long, vector bool long long)
vector float vec_eqv(vector float, vector float)
vector float vec_eqv(vector bool int, vector float)
vector float vec_eqv(vector float, vector bool int)
vector double vec_eqv(vector double, vector double)
vector double vec_eqv(vector bool long long, vector double)
vector double vec_eqv(vector double, vector bool long long)
vector bool long long vec_perm(vector bool long long, vector bool long long, vector unsigned char)
vector double vec_round(vector double)
vector double vec_splat(vector double, const int)
vector bool long long vec_splat(vector bool long long, const int)
vector signed long long vec_splat(vector signed long long, const int)
vector unsigned long long vec_splat(vector unsigned long long,
vector bool int vec_sld(vector bool int, vector bool int, const int)
vector bool short vec_sld(vector bool short, vector bool short, const int)
llvm-svn: 241904
Code in CGCall.cpp that loads up function arguments that need to be
coerced to a different type may in some cases ignore the fact that
the source of the argument is not naturally aligned. This may cause
incorrect code to be generated. In some places in CreateCoercedLoad,
we already have setAlignment calls to address this, but I ran into one
where it was missing, causing wrong code generation on SystemZ.
However, in that location, we do not actually know what alignment of
the source location we can rely on; the callers do not pass anything
to this routine. This is already an issue in other places in
CreateCoercedLoad; and the same problem exists for CreateCoercedStore.
To avoid pessimising code, and to fix the FIXMEs already in place,
this patch also adds an alignment argument to the CreateCoerced*
routines and uses it instead of forcing an alignment of 1. The
callers are changed to pass in the best information they have.
This actually requires changes in a number of existing test cases
since we now get better alignment in many places.
Differential Revision: http://reviews.llvm.org/D11033
llvm-svn: 241898
Move the diagnostic back to codegen so that we can compile ATL on the
self-host bot. We don't actually end up emitting code for the __try, so
the diagnostic won't be hit.
llvm-svn: 241761
This patch adds ObjectFilePCHContainerOperations uses the LLVM backend
to put the contents of a PCH into a __clangast section inside a COFF, ELF,
or Mach-O object file container.
This is done to facilitate module debugging by makeing it possible to
store the debug info for the types defined by a module alongside the AST.
rdar://problem/20091852
llvm-svn: 241620
"-arm-long-calls".
This change allows using -mlong-calls/-mno-long-calls for LTO and enabling or
disabling long call on a per-function basis.
rdar://problem/21529937
Differential Revision: http://reviews.llvm.org/D9414
llvm-svn: 241565
different function signatures. (Previously clang would emit all block
pointer types with the type of the first block pointer in the compile
unit.)
rdar://problem/21602473
llvm-svn: 241534
This reverts commit r241244, but restricts SEH support to Win64.
This way, Chromium builds will still fall back on TUs with SEH, and
Clang developers can work on this incrementally upstream while patching
this small predicate locally. It'll also make it easier to review small
fixes.
llvm-svn: 241533
The patch is the same except for the addition of a new test for the
issue that required reverting the dependent llvm commit.
--Original Commit Message--
Pass down the -flto option to the -cc1 job, and from there into the
CodeGenOptions and onto the PassManagerBuilder. This enables gating
the new EliminateAvailableExternally module pass on whether we are
preparing for LTO.
If we are preparing for LTO (e.g. a -flto -c compile), the new pass is not
included as we want to preserve available externally functions for possible
link time inlining.
llvm-svn: 241467
instructions introduced in POWER8.
These are the Clang-related changes for http://reviews.llvm.org/D10704
All builtins are added in altivec.h and guarded with the POWER8_VECTOR macro.
Phabricator review: http://reviews.llvm.org/D10736
llvm-svn: 241293
32-bit finally funclets are intended to be called both directly from the
parent function and indirectly from the EH runtime. Because we aren't
contorting LLVM's X86 prologue to match MSVC's, calling the finally
block directly passes in a different value of EBP than the one that the
runtime provides. We need an adapter thunk to adjust EBP to the expected
value. However, WinEHPrepare already has to solve this problem when
cleanups are not pre-outlined, so we can go ahead and rely on it rather
than duplicating work.
Now we only do the llvm.x86.seh.recoverfp dance for 32-bit SEH filter
functions.
llvm-svn: 241187
This re-lands r236052 and adds support for __exception_code().
In 32-bit SEH, the exception code is not available in eax. It is only
available in the filter function, and now we arrange to load it and
store it into an escaped variable in the parent frame.
As a consequence, we have to disable the "catch i8* null" optimization
on 32-bit and always generate a filter function. We can re-enable the
optimization if we detect an __except block that doesn't use the
exception code, but this probably isn't worth optimizing.
Reviewers: majnemer
Differential Revision: http://reviews.llvm.org/D10852
llvm-svn: 241171
This reinstates part of the hack removed in r233223, by special
casing sse4 as part of the feature additions. The notable change
here is that we consider it only as part of setting the SSE level
and not as part of the actual target features set which handles
setting the rest of the masks.
llvm-svn: 241130
using a string map to canonicalize. Fix up a couple of testcases
that needed changing since we are no longer simply appending features
to the list, but all of their mask dependencies as well.
llvm-svn: 241129
Add intrinsics for the FXSR instructions (FXSAVE/FXSAVE64/FXRSTOR/FXRSTOR64)
These were previously declared in Intrin.h for MSVC compatibility, but now
that we have them implemented, these declarations can be removed.
llvm-svn: 241053
isTriviallyRecursive is a hack used to bridge a gap between the
expectations that source code assumes and the semantics that LLVM IR can
provide. Specifically, asm labels on functions are treated as an
explicit name for a GlobalObject in Clang but treated like an
output-processing step in GCC. Tweak this hack a little further to emit
calls to library functions instead of emitting an incorrect definition.
The definition in question would have available_externally linkage (this
is OK) but result in a call to itself which will either result in an
infinite loop or stack overflow.
This fixes PR23964.
llvm-svn: 241043
This matches the implementation of the gcc support for the same
feature, including checking the values set up by libgcc at runtime.
The structure looks like this:
unsigned int __cpu_vendor;
unsigned int __cpu_type;
unsigned int __cpu_subtype;
unsigned int __cpu_features[1];
with a set of enums to match various fields that are field out after
parsing the output of the cpuid instruction.
This also adds a set of errors checking for valid input (and cpu).
compiler-rt support for this and the other builtins in this family
(__builtin_cpu_init and __builtin_cpu_is) are forthcoming.
llvm-svn: 240994
We failed to see that we should have deferred the creation of a type
which references a type currently under construction because of atomic
sugar.
This fixes PR23985.
llvm-svn: 240989
Several tests wouldn't pass when executed on an armv7a_pc_linux triple
due to the non-default arm_aapcs calling convention produced on the
function definitions in the IR output. Account for this with the
application of a little regex.
Patch by Ying Yi.
llvm-svn: 240971
This patch corresponds to review:
http://reviews.llvm.org/D10637
This is the first round of additions of missing builtins listed in the ABI document. More to come (this builds onto what seurer already addes). This patch adds:
vector signed long long vec_abs(vector signed long long)
vector double vec_abs(vector double)
vector signed long long vec_add(vector signed long long, vector signed long long)
vector unsigned long long vec_add(vector unsigned long long, vector unsigned long long)
vector double vec_add(vector double, vector double)
vector double vec_and(vector bool long long, vector double)
vector double vec_and(vector double, vector bool long long)
vector double vec_and(vector double, vector double)
vector signed long long vec_and(vector signed long long, vector signed long long)
vector double vec_andc(vector bool long long, vector double)
vector double vec_andc(vector double, vector bool long long)
vector double vec_andc(vector double, vector double)
vector signed long long vec_andc(vector signed long long, vector signed long long)
vector double vec_ceil(vector double)
vector bool long long vec_cmpeq(vector double, vector double)
vector bool long long vec_cmpge(vector double, vector double)
vector bool long long vec_cmpge(vector signed long long, vector signed long long)
vector bool long long vec_cmpge(vector unsigned long long, vector unsigned long long)
vector bool long long vec_cmpgt(vector double, vector double)
vector bool long long vec_cmple(vector double, vector double)
vector bool long long vec_cmple(vector signed long long, vector signed long long)
vector bool long long vec_cmple(vector unsigned long long, vector unsigned long long)
vector bool long long vec_cmplt(vector double, vector double)
vector bool long long vec_cmplt(vector signed long long, vector signed long long)
vector bool long long vec_cmplt(vector unsigned long long, vector unsigned long long)
llvm-svn: 240821
Attribute 'nodebug' means no llvm.dbg.* intrinsics, no !dbg
annotations, and no DISubprogram for the function.
Differential Revision: http://reviews.llvm.org/D10747
llvm-svn: 240747
Integer variants are implemented as atomicrmw or cmpxchg instructions.
Atomic add for floating point (__nvvm_atom_add_gen_f()) is implemented
as a call to an overloaded @llvm.nvvm.atomic.load.add.f32.* LVVM
intrinsic.
Differential Revision: http://reviews.llvm.org/D10666
llvm-svn: 240669
The Microsoft-extension _MoveToCoprocessor and _MoveToCoprocessor2
builtins take the register value to be moved as the first argument,
but the corresponding mcr and mcr2 LLVM intrinsics expect that value
to be the third argument. Handle this as a special case, while still
leaving those intrinsics as generic MSBuiltins. I considered the
alternative of handling these in EmitARMBuiltinExpr, but that does
not work well for the follow-up change that I'm going to make to improve
the error handling for PR22560 -- we need the GetBuiltinType() checks
for ICEArguments, and the ARM version of that code is only used for
Neon intrinsics where the last argument is special and not
checked in the normal way.
llvm-svn: 240462
As specified in the SysV AVX512 ABI drafts. It follows the same scheme
as AVX2:
Arguments of type __m512 are split into eight eightbyte chunks.
The least significant one belongs to class SSE and all the others
to class SSEUP.
This also means we change the OpenMP SIMD default alignment on AVX512.
Based on r240337.
Differential Revision: http://reviews.llvm.org/D9894
llvm-svn: 240338
This patch adds initial support for the -fsanitize=kernel-address flag to Clang.
Right now it's quite restricted: only out-of-line instrumentation is supported, globals are not instrumented, some GCC kasan flags are not supported.
Using this patch I am able to build and boot the KASan tree with LLVMLinux patches from github.com/ramosian-glider/kasan/tree/kasan_llvmlinux.
To disable KASan instrumentation for a certain function attribute((no_sanitize("kernel-address"))) can be used.
llvm-svn: 240131
Base type of attribute((mode)) can actually be a vector type.
The patch is to distinguish between base type and base element type.
This fixes http://llvm.org/PR17453.
Differential Revision: http://reviews.llvm.org/D10058
llvm-svn: 240125
This flag controls whether a given sanitizer traps upon detecting
an error. It currently only supports UBSan. The existing flag
-fsanitize-undefined-trap-on-error has been made an alias of
-fsanitize-trap=undefined.
This change also cleans up some awkward behavior around the combination
of -fsanitize-trap=undefined and -fsanitize=undefined. Previously we
would reject command lines containing the combination of these two flags,
as -fsanitize=vptr is not compatible with trapping. This required the
creation of -fsanitize=undefined-trap, which excluded -fsanitize=vptr
(and -fsanitize=function, but this seems like an oversight).
Now, -fsanitize=undefined is an alias for -fsanitize=undefined-trap,
and if -fsanitize-trap=undefined is specified, we treat -fsanitize=vptr
as an "unsupported" flag, which means that we error out if the flag is
specified explicitly, but implicitly disable it if the flag was implied
by -fsanitize=undefined.
Differential Revision: http://reviews.llvm.org/D10464
llvm-svn: 240105
This patch adds the -fsanitize=safe-stack command line argument for clang,
which enables the Safe Stack protection (see http://reviews.llvm.org/D6094
for the detailed description of the Safe Stack).
This patch is our implementation of the safe stack on top of Clang. The
patches make the following changes:
- Add -fsanitize=safe-stack and -fno-sanitize=safe-stack options to clang
to control safe stack usage (the safe stack is disabled by default).
- Add __attribute__((no_sanitize("safe-stack"))) attribute to clang that can be
used to disable the safe stack for individual functions even when enabled
globally.
Original patch by Volodymyr Kuznetsov and others at the Dependable Systems
Lab at EPFL; updates and upstreaming by myself.
Differential Revision: http://reviews.llvm.org/D6095
llvm-svn: 239762
in section 10.1, __arm_{w,r}sr{,p,64}.
This includes arm_acle.h definitions with builtins and codegen to support
these, the intrinsics are implemented by generating read/write_register calls
which get appropriately lowered in the backend based on the register string
provided. SemaChecking is also implemented to fault invalid parameters.
Differential Revision: http://reviews.llvm.org/D9697
llvm-svn: 239737
Instead, just EvaluateAsInt().
Follow-up to r239549: rsmith points out that isICE() is expensive;
seems like it's not the right concept anyway, as it fails on
`static const' in C, and will actually trigger the assert below on:
test/Sema/inline-asm-validate-x86.c
llvm-svn: 239651
Summary:
In addition to easier syntax, IRBuilder makes sure to set correct
debug locations for newly added instructions (bitcast and
llvm.lifetime itself). This restores the original behavior, which
was modified by r234581 (reapplied as r235553).
Extend one of the tests to check for debug locations.
Test Plan: regression test suite
Reviewers: aadg, dblaikie
Subscribers: cfe-commits, majnemer
Differential Revision: http://reviews.llvm.org/D10418
llvm-svn: 239643
Right now we're ignoring the fpmath attribute since there's no
backend support for a feature like this and to do so would require
checking the validity of the strings and doing general subtarget
feature parsing of valid and invalid features with the target
attribute feature.
llvm-svn: 239582
Modeled after the gcc attribute of the same name, this feature
allows source level annotations to correspond to backend code
generation. In llvm particular parlance, this allows the adding
of subtarget features and changing the cpu for a particular function
based on source level hints.
This has been added into the existing support for function level
attributes without particular verification for any target outside
of whether or not the backend will support the features/cpu given
(similar to section, etc).
llvm-svn: 239579
For inline assembly immediate constraints, we currently always use
EmitScalarExpr, instead of directly emitting the constant. When the
overflow sanitizer is enabled, this generates overflow intrinsics
instead of constants.
Instead, emit a constant for constraints that either require an
immediate (e.g. 'I' on X86), or only accepts constants (immediate
or symbolic; i.e., don't accept registers or memory).
Fixes PR19763.
Differential Revision: http://reviews.llvm.org/D10255
llvm-svn: 239549
This patch corresponds to review:
http://reviews.llvm.org/D10095
This is for just two instructions and related builtins:
vbpermq
vgbbd
llvm-svn: 239506
CodeGenOptions and onto the PassManagerBuilder. This enables gating
the new EliminateAvailableExternally module pass on whether we are
preparing for LTO.
If we are preparing for LTO (e.g. a -flto -c compile), the new pass is not
included as we want to preserve available externally functions for possible
link time inlining.
llvm-svn: 239481
Based on previous discussion on the mailing list, clang currently lacks support
for C99 partial re-initialization behavior:
Reference: http://lists.cs.uiuc.edu/pipermail/cfe-dev/2013-April/029188.html
Reference: http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_253.htm
This patch attempts to fix this problem.
Given the following code snippet,
struct P1 { char x[6]; };
struct LP1 { struct P1 p1; };
struct LP1 l = { .p1 = { "foo" }, .p1.x[2] = 'x' };
// this example is adapted from the example for "struct fred x[]" in DR-253;
// currently clang produces in l: { "\0\0x" },
// whereas gcc 4.8 produces { "fox" };
// with this fix, clang will also produce: { "fox" };
Differential Review: http://reviews.llvm.org/D5789
llvm-svn: 239446
This commit adds back the code that seems to have been dropped unintentionally
in r176985.
rdar://problem/13752163
Differential Revision: http://reviews.llvm.org/D10100
llvm-svn: 239426
The parameter types and return type do not need to be volatile just
because the pointer type's pointee type is volatile qualified. This is
an unnecessary pessimization.
llvm-svn: 238892
We catch most of the various other __fp16 implicit conversions to
float, but not this one:
__fp16 a;
int i;
...
a += i;
For which we used to generate something 'fun' like:
%conv = sitofp i32 %i to float
%1 = tail call i16 @llvm.convert.to.fp16.f32(float %conv)
%add = add i16 %0, %1
Instead, when we have an __fp16 LHS and an integer RHS, we should
use float as the result type.
While there, add a bunch of missing tests for mixed
__fp16/integer expressions.
llvm-svn: 238625
Folding IntToPtr or PtrToInt into Loads, due to r238452,
perturbs the mips-varargs test-case.
Patch by Philip Pfaffe!
Differential Revision: http://reviews.llvm.org/D9153
llvm-svn: 238455
Re-land the change r238200, but with modifications in the tests that should
prevent new failures in some environments as reported with the original
change on the mailing list.
llvm-svn: 238253
Note: __declspec is also temporarily enabled when compiling for a CUDA target because there are implementation details relying on __declspec(property) support currently. When those details change, __declspec should be disabled for CUDA targets.
llvm-svn: 238238
On MIPS unsigned int type should not be zero extended but sign-extended.
Patch by Strahinja Petrovic.
Differential Revision: http://reviews.llvm.org/D9198
llvm-svn: 238200
in POWER8.
These are the Clang-related changes for http://reviews.llvm.org/D9081
vadduqm
vaddeuqm
vaddcuq
vaddecuq
vsubuqm
vsubeuqm
vsubcuq
vsubecuq
All builtins are added in altivec.h, and guarded with the POWER8_VECTOR and
powerpc64 macros.
http://reviews.llvm.org/D9903
llvm-svn: 238145
This patch adds support for the following new instructions in the
Power ISA 2.07:
vpksdss
vpksdus
vpkudus
vpkudum
vupkhsw
vupklsw
These instructions are available through the vec_packs, vec_packsu,
vec_unpackh, and vec_unpackl built-in interfaces. These are
lane-sensitive instructions, so the built-ins have different
implementations for big- and little-endian, and the instructions must
be marked as killing the vector swap optimization for now.
The first three instructions perform saturating pack operations. The
fourth performs a modulo pack operation, which means it can be
represented with a vector shuffle, and conversely the appropriate
vector shuffles may cause this instruction to be generated. The other
instructions are only generated via built-in support for now.
I noticed during patch preparation that the macro __VSX__ was not
previously predefined when the power8-vector or direct-move features
are requested. This is an error, and I've corrected that here as
well.
Appropriate tests have been added.
There is a companion patch to llvm for the rest of this support.
llvm-svn: 237500
Summary:
r235215 enables support in LLVM for legalizing f16 type in the IR. AArch64
already had support for this. r235215 and some backend patches brought support
for ARM, X86, X86-64, Mips and Mips64.
This change exposes the LangOption 'NativeHalfType' in the command line, so the
backend legalization can be used if desired. NativeHalfType is enabled for
OpenCL (current behavior) or if '-fnative-half-type' is set.
Reviewers: olista01, steven_wu, ab
Subscribers: cfe-commits, srhines, aemerson
Differential Revision: http://reviews.llvm.org/D9781
llvm-svn: 237406
Previously we were setting LangOptions::GNUInline (which controls whether we
use traditional GNU inline semantics) if the language did not have the C99
feature flag set. The trouble with this is that C++ family languages also
do not have that flag set, so we ended up setting this flag in C++ modes
(and working around it in a few places downstream by also checking CPlusPlus).
The fix is to check whether the C89 flag is set for the target language,
rather than whether the C99 flag is cleared. This also lets us remove most
CPlusPlus checks. We continue to test CPlusPlus when deciding whether to
pre-define the __GNUC_GNU_INLINE__ macro for consistency with GCC.
There is a change in semantics in two other places
where we weren't checking both CPlusPlus and GNUInline
(FunctionDecl::doesDeclarationForceExternallyVisibleDefinition and
FunctionDecl::isInlineDefinitionExternallyVisible), but this change seems to
put us back into line with GCC's semantics (test case: test/CodeGen/inline.c).
While at it, forbid -fgnu89-inline in C++ modes, as GCC doesn't support it,
it didn't have any effect before, and supporting it just makes things more
complicated.
Differential Revision: http://reviews.llvm.org/D9333
llvm-svn: 237299
GetOutputStream() owns the stream it returns pointer to and the
pointer should never be freed by us. When we fail to load and exit
early, unique_ptr still holds the pointer and frees it which leads to
compiler crash when CompilerInstance attempts to free it again.
Added regression test for failed bitcode linking.
Differential Revision: http://reviews.llvm.org/D9625
llvm-svn: 237159
Fix for codegen of static variables declared inside of captured statements. Captured statements are actually a transparent DeclContexts, so we have to skip them when trying to get a mangled name for statics.
Differential Revision: http://reviews.llvm.org/D9522
llvm-svn: 236701
This adds low-level builtins to allow access to all of the z13 vector
instructions. Note that instructions whose semantics can be described
by standard C (including clang extensions) do not get any builtins.
For each instructions whose semantics *cannot* (fully) be described, we
define a builtin named __builtin_s390_<insn> that directly maps to this
instruction. These are intended to be compatible with GCC.
For instructions that also set the condition code, the builtin will take
an extra argument of type "int *" at the end. The integer pointed to by
this argument will be set to the post-instruction CC value.
For many instructions, the low-level builtin is mapped to the corresponding
LLVM IR intrinsic. However, a number of instructions can be represented
in standard LLVM IR without requiring use of a target intrinsic.
Some instructions require immediate integer operands within a certain
range. Those are verified at the Sema level.
Based on a patch by Richard Sandiford.
llvm-svn: 236532
This patch adds support for the z13 architecture type. For compatibility
with GCC, a pair of options -mvx / -mno-vx can be used to selectively
enable/disable use of the vector facility.
When the vector facility is present, we default to the new vector ABI.
This is characterized by two major differences:
- Vector types are passed/returned in vector registers
(except for unnamed arguments of a variable-argument list function).
- Vector types are at most 8-byte aligned.
The reason for the choice of 8-byte vector alignment is that the hardware
is able to efficiently load vectors at 8-byte alignment, and the ABI only
guarantees 8-byte alignment of the stack pointer, so requiring any higher
alignment for vectors would require dynamic stack re-alignment code.
However, for compatibility with old code that may use vector types, when
*not* using the vector facility, the old alignment rules (vector types
are naturally aligned) remain in use.
These alignment rules are not only implemented at the C language level,
but also at the LLVM IR level. This is done by selecting a different
DataLayout string depending on whether the vector ABI is in effect or not.
Based on a patch by Richard Sandiford.
llvm-svn: 236531
Cyclone actually supports all the goodies you'd expect to come with an AArch64
CPU, so it doesn't need its own clause. Also we should probably be testing
these clauses.
llvm-svn: 236349
by erasing the soft-float target feature if the rest of the front
end added it because of defaults or the soft float option.
Add some testing for some of the targets that implement this hack.
llvm-svn: 236179
LLVM r236120 renamed debug info IR constructs to use a `DI` prefix, now
that the `DIDescriptor` hierarchy has been gone for about a week. This
commit was generated using the rename-md-di-nodes.sh upgrade script
attached to PR23080, followed by running clang-format-diff.py on the
`lib/` portion of the patch.
llvm-svn: 236121
This is just the clang-side of 32-bit SEH. LLVM still needs work, and it
will determinstically fail to compile until it's feature complete.
On x86, all outlined handlers have no parameters, but they do implicitly
take the EBP value passed in and use it to address locals of the parent
frame. We model this with llvm.frameaddress(1).
This works (mostly), but __finally block inlining can break it. For now,
we apply the 'noinline' attribute. If we really want to inline __finally
blocks on 32-bit x86, we should teach the inliner how to untangle
frameescape and framerecover.
Promote the error diagnostic from codegen to sema. It now rejects SEH on
non-Windows platforms. LLVM doesn't implement SEH on non-x86 Windows
platforms, but there's nothing preventing it.
llvm-svn: 236052
When creating a global variable with a type of a struct with bitfields, we must
forcibly set the alignment of the global from the RecordDecl. We must do this so
that the proper bitfield alignment makes its way down to LLVM, since clang will
mangle the bitfields into one large type.
llvm-svn: 235976
This makes sure that the front end is specific about what they're expecting
the backend to produce. Update a FIXME with the idea that the target-features
could be more precise using backend knowledge.
llvm-svn: 235936
In r235553, Clang started emitting lifetime markers more often. This
caused false negative in MSan, because MSan only poisons all allocas
once at function entry. Eventually, MSan should poison allocas at
lifetime start and probably also lifetime end, but until then, let's not
emit markers that aren't going to be useful.
llvm-svn: 235613
Summary:
Make sure signed overflow in "x--" is checked with
llvm.ssub.with.overflow intrinsic and is reported as:
"-2147483648 - 1 cannot be represented in type 'int'"
instead of:
"-2147483648 + -1 cannot be represented in type 'int'"
, like we do for unsigned overflow.
Test Plan: clang + compiler-rt regression test suite
Reviewers: rsmith
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D8236
llvm-svn: 235568
This reverts commit r234700. It turns out that the lifetime markers
were not the cause of Chromium failing but a bug which was uncovered by
optimizations exposed by the markers.
llvm-svn: 235553
Code in CodeGenModule::GetOrCreateLLVMGlobal that sets up GlobalValue
object for LLVM external symbols has this comment:
// FIXME: This code is overly simple and should be merged with other global
// handling.
One part does seems to be "overly simple" currently is that this code
never sets any alignment info on the GlobalValue, so that the emitted
IR does not have any align attribute on external globals. This can
lead to unnecessarily inefficient code generation.
This patch adds a GV->setAlignment call to set alignment info.
llvm-svn: 235396
SystemZ prefers to align all global variables to two bytes, which is
implemented by setting the TargetInfo member MinGlobalAlign.
However, for compatibility with existing compilers this should *not*
change the ABI alignment value as retrieved via __alignof__, which
it currently does.
This patch fixes the issue by having ASTContext::getDeclAlign ignore
the MinGlobalAlign setting in the ForAlignof case.
Since SystemZ is the only platform setting MinGlobalAlign, this should
cause no change for any other target.
llvm-svn: 235395
Something like { void*, void * } would be passed to a function as a [2 x i64], but returned as an i128. This patch unifies the 2 behaviours so that we also return it as a [2 x i64].
This is better for the quality of the IR, and the size of the final LLVM binary as we tend to want to insert/extract values from these types and do so with the insert/extract instructions is less IR than shifting, truncating, and or'ing values.
Reviewed by Tim Northover.
llvm-svn: 235231
Things can't both be in comdats and have common linkage, so never give things
in comdats common linkage. Common linkage is only used in .c files, and the
only thing that can trigger a comdat in c is selectany from what I can tell.
Fixes PR23243.
Also address an over-the-shoulder review comment from rnk by moving the
hasAttr<SelectAnyAttr>() in Decl.cpp around a bit. It only makes a minor
difference for selectany on global variables, so it goes well with the rest of
this patch.
http://reviews.llvm.org/D9042
llvm-svn: 235053
This patch generates a warning for invalid combination of '-mnan' and
'-march' options, it properly sets NaN encoding for a given '-march',
and it passes a proper NaN encoding to the assembler.
Patch by Vladimir Radosavljevic.
Differential Revision: http://reviews.llvm.org/D8170
llvm-svn: 234882
Even though these symbols are in a comdat group, the Microsoft linker
really wants them to have internal linkage.
I'm planning to tweak the mangling in a follow-up change. This is a
straight revert with a 1-line fix.
llvm-svn: 234613
Now that TailRecursionElimination has been fixed with r222354, the
threshold on size for lifetime marker insertion can be removed. This
only affects named temporary though, as the patch for unnamed temporaries
is still in progress.
My previous commit (r222993) was not handling debuginfo correctly, but
this could only be seen with some asan tests. Basically, lifetime markers
are just instrumentation for the compiler's usage and should not affect
debug information; however, the cleanup infrastructure was assuming it
contained only destructors, i.e. actual code to be executed, and was
setting the breakpoint for the end of the function to the closing '}', and
not the return statement, in order to show some destructors have been
called when leaving the function. This is wrong when the cleanups are only
lifetime markers, and this is now fixed.
llvm-svn: 234581
This patch corresponds to review:
http://reviews.llvm.org/D8398
It adds some builtin functions to access the extended divide and bit permute instructions.
llvm-svn: 234547
WinEHPrepare was going to have to pattern match the control flow merge
and split that the old lowering used, and that wasn't really feasible.
Now we can teach WinEHPrepare to pattern match this, which is much
simpler:
%fp = call i8* @llvm.frameaddress(i32 0)
call void @func(iN [01], i8* %fp)
This prototype happens to match the prototype used by the Win64 SEH
personality function, so this is really simple.
llvm-svn: 234532
The driver currently accepts but ignores the -freciprocal-math flag.
This patch passes the flag through and enables 'arcp' fast-math-flag
generation in IR.
Note that this change does not actually enable the optimization for
any target. The reassociation optimization that this flag specifies
was implemented by http://reviews.llvm.org/D6334 :
http://llvm.org/viewvc/llvm-project?view=revision&revision=222510
Because the optimization is done in the backend rather than IR,
the backend must be modified to understand instruction-level
fast-math-flags or a new function-level attribute must be created.
Also note that -freciprocal-math is independent of any target-specific
usage of reciprocal estimate hardware instructions. That requires
its own flag ('-mrecip').
https://llvm.org/bugs/show_bug.cgi?id=20912
llvm-svn: 234493
Do the same thing as win64. If we're not using COFF, use the ELF
manglings. Maybe if we are targetting *-windows-msvc-macho, we should
use darwin manglings, but I don't need to stir that pot today.
llvm-svn: 233819
The zEC12 provides the transactional-execution facility. This is exposed
to users via a set of builtin routines on other compilers. This patch
adds clang support to enable those builtins. In partciular, the patch:
- enables the transactional-execution feature by default on zEC12
- allows to override presence of that feature via the -mhtm/-mno-htm options
- adds a predefined macro __HTM__ if the feature is enabled
- adds support for the transactional-execution GCC builtins
- adds Sema checking to verify the __builtin_tabort abort code
- adds the s390intrin.h header file (for GCC compatibility)
- adds s390 sections to the htmintrin.h and htmxlintrin.h header files
Since this is first use of target-specific intrinsics on the platform,
the patch creates the include/clang/Basic/BuiltinsSystemZ.def file and
hooks it up in TargetBuiltins.h and lib/Basic/Targets.cpp.
An associated LLVM patch adds the required LLVM IR intrinsics.
For reference, the transactional-execution instructions are documented
in the z/Architecture Principles of Operation for the zEC12:
http://publibfp.boulder.ibm.com/cgi-bin/bookmgr/download/DZ9ZR009.pdf
The associated builtins are documented in the GCC manual:
http://gcc.gnu.org/onlinedocs/gcc/S_002f390-System-z-Built-in-Functions.html
The htmxlintrin.h intrinsics provided for compatibility with the IBM XL
compiler are documented in the "z/OS XL C/C++ Programming Guide".
llvm-svn: 233804
Add Tool and ToolChain support for clang to target the NaCl OS using the NaCl
SDK for x86-32, x86-64 and ARM.
Includes nacltools::Assemble and Link which are derived from gnutools. They
are similar to Linux but different enought that they warrant their own class.
Also includes a NaCl_TC in ToolChains derived from Generic_ELF with library
and include paths suitable for an SDK and independent of the system tools.
Differential Revision: http://reviews.llvm.org/D8590
llvm-svn: 233594
The argument range checks for the HTM and Crypto builtins were implemented in
CGBuiltin.cpp, not in Sema. This change moves them to the appropriate location
in SemaChecking.cpp. It requires the creation of a new method in the Sema class
to do checks for PPC-specific builtins.
http://reviews.llvm.org/D8672
llvm-svn: 233586
Test cases must not check for symbolic variable names that are not
present in IR generated by no-assert builds.
Fixed by testing a more complete subset of the va_arg dataflow,
without relying on variable names.
llvm-svn: 233574
Running the GCC's inter-compiler ABI compatibility test suite uncovered
a couple of errors in clang's SystemZ ABI implementation. These all
affect only rare corner cases:
- Short vector types
GCC synthetic vector types defined with __attribute__ ((vector_size ...))
are always passed and returned by reference. (This is not documented in
the official ABI document, but is the de-facto ABI implemented by GCC.)
clang would do that only for vector sizes >= 16 bytes, but not for shorter
vector types.
- Float-like aggregates and empty bitfields
clang would consider any aggregate containing an empty bitfield as
first element to be a float-like aggregate. That's obviously wrong.
According to the ABI doc, the presence of an empty bitfield makes
an aggregate to be *not* float-like. However, due to a bug in GCC,
empty bitfields are ignored in C++; this patch changes clang to be
compatible with this "feature" of GCC.
- Float-like aggregates and va_arg
The va_arg implementation would mis-detect some aggregates as float-like
that aren't actually passed as such. This applies to aggregates that
have only a single element of type float or double, but using an aligned
attribute that increases the total struct size to more than 8 bytes.
This error occurred because the va_arg implement used to have an copy
of the float-like aggregate detection logic (i.e. it would call the
isFPArgumentType routine, but not perform the size check).
To simplify the logic, this patch removes the duplicated logic and
instead simply checks the (possibly coerced) LLVM argument type as
already determined by classifyArgumentType.
llvm-svn: 233543
Eric Christopher pointed out that we have a check for assembly code
generation in a clang test, which isn't cool. We already have Driver
and back-end CodeGen tests for the .abiversion handling, so this
testing is unnecessary anyway. Make it go away.
llvm-svn: 233314
This patch adds Hardware Transaction Memory (HTM) support supported by ISA 2.07
(POWER8). The intrinsic support is based on GCC one [1], with both 'PowerPC HTM
Low Level Built-in Functions' and 'PowerPC HTM High Level Inline Functions'
implemented.
Along with builtins a new driver switch is added to enable/disable HTM
instruction support (-mhtm) and a header with common definitions (mostly to
parse the TFHAR register value). The HTM switch also sets a preprocessor builtin
HTM.
The HTM usage requires a recently newer kernel with PPC HTM enabled. Tested on
powerpc64 and powerpc64le.
This is send along a llvm patch to enabled the builtins and option switch.
[1]
https://gcc.gnu.org/onlinedocs/gcc/PowerPC-Hardware-Transactional-Memory-Built-in-Functions.html
Phabricator Review: http://reviews.llvm.org/D8248
llvm-svn: 233205
I'm about to commit a patch for:
http://reviews.llvm.org/D8567
That patch will break this one existing test case in Clang.
I'm not sure if this file is intending to create a Clang
dependency on the LLVM IR optimizer, but that's the
consequence of specifying -O3 on this test file.
My hope is to avoid buildbot rage by removing this check,
committing the LLVM patch, and then fixing this check.
I don't know how to make a simultaneous commit to Clang
and LLVM.
I will commit the correct CHECK line fix for this test
shortly.
llvm-svn: 233109
PS4 target recognizes the #pragma comment() syntax as in -fms-extensions, but
only handles the case of #pragma comment(lib). This patch adds a warning if any
other arguments are encountered.
This patch also refactors the code in ParsePragma.cpp a little bit to make it
more obvious that some codes are being shared between -fms-extensions and PS4.
llvm-svn: 233015
On AArch64, the -fallow-half-args-and-returns option is the default.
With it, the half type is considered legal (rather than the i16 used
normally for __fp16), but no operation is, except conversions and
load/stores and such.
The previous behavior was tantamount to saying LangOpts.NativeHalfType
was implied by LangOpts.HalfArgsAndReturns, which isn't true.
Instead, teach the various parts of CodeGen that already know about
half (using the intrinsics or not) about this weird in-between case,
where the "half" type is legal, but operations on it aren't.
This is a smaller intermediate step to the end-goal of removing the
intrinsic, always using "half", and letting the backend legalize.
Builds on r232968.
rdar://20045970, rdar://17468714
Differential Revision: http://reviews.llvm.org/D8367
llvm-svn: 232971
Fix the CodeGen so that for types bigger than float, instead of
converting to fp16 via the sequence "InTy -> float -> fp16", we
perform conversions in just one step. This avoids the double
rounding which potentially changes results from a natural
IEEE-754 operation.
rdar://17594379, rdar://17468714
Differential Revision: http://reviews.llvm.org/D4602
Part of: http://reviews.llvm.org/D8367
llvm-svn: 232968
the target-cpu, if different from the triple's cpu, and
target-features as they're written that are passed down from the
driver.
Together with LLVM r232885 this should allow the LTO'ing of binaries
that contain modules compiled with different code generation options
on a subset of architectures with full backend support (x86, powerpc,
aarch64).
llvm-svn: 232888
Somehow, we never managed to implement this fully. We could constant
fold it like crazy, including constant folding complex arguments, etc.
But if you actually needed to generate code for it, error.
I've implemented it using the somewhat obvious lowering. Happy for
suggestions on a more clever way to lower this.
Now, what you might ask does this have to do with modules? Fun story. So
it turns out that libstdc++ actually uses __builtin_isinf_sign to
implement std::isinf when in C++98 mode, but only inside of a template.
So if we're lucky, and we never instantiate that, everything is good.
But once we try to instantiate that template function, we need this
builtin. All of my customers at least are using C++11 and so they never
hit this code path.
But what does that have to do with modules? Fun story. So it turns out
that with modules we actually observe a bunch of bugs in libstdc++ where
their <cmath> header clobbers things exposed by <math.h>. To fix these,
we have to provide global function definitions to replace the macros
that C99 would have used. And it turns out that ::isinf needs to be
implemented using the exact semantics used by the C++98 variant of
std::isinf. And so I started to fix this bug in libstdc++ and ceased to
be able to compile libstdc++ with Clang.
The yaks are legion.
llvm-svn: 232778
location data is available. If pragma handling wants to look up the
position, it finds the LLVM buffer and wants to compare it with the
special built-in buffer, failing badly. Extend to the special handling
of the built-in buffer to also check for the inline asm buffer. Expect
only a single asm buffer. Sort it between the built-in buffers and the
normal file buffers.
Fixes the assert part of PR 22576.
llvm-svn: 232389
In preparation for recommit of revision 232190, change tests so that they
are resilient to operands being commuted by the reassociate pass.
llvm-svn: 232206
This is nearly identical to the v*f128_si256 parts of r231792 and r232052.
AVX2 introduced proper integer variants of the hacked integer insert/extract
C intrinsics that were created for this same functionality with AVX1.
This should complete the front end fixes for insert/extract128 intrinsics.
Corresponding LLVM patch to follow.
llvm-svn: 232109
This is very much like D8088 (checked in at r231792).
Now that we've replaced the vinsertf128 intrinsics,
do the same for their extract twins.
Differential Revision: http://reviews.llvm.org/D8275
llvm-svn: 232052
Support for the QPX vector instruction set, used on the IBM BG/Q supercomputer,
has recently been added to the LLVM PowerPC backend. This vector instruction
set requires some ABI modifications because the ABI on the BG/Q expects
<4 x double> vectors to be provided with 32-byte stack alignment, and to be
handled as native vector types (similar to how Altivec vectors are handled on
mainline PPC systems). I've named this ABI variant elfv1-qpx, have made this
the default ABI when QPX is supported, and have updated the ABI handling code
to provide QPX vectors with the correct stack alignment and associated
register-assignment logic.
llvm-svn: 231960
We want to replace as much custom x86 shuffling via intrinsics
as possible because pushing the code down the generic shuffle
optimization path allows for better codegen and less complexity
in LLVM.
This is the sibling patch for the LLVM half of this change:
http://reviews.llvm.org/D8086
Differential Revision: http://reviews.llvm.org/D8088
llvm-svn: 231792
This is a recommit of r231150, reverted in r231409. Turns out
that -fsanitize=shift-base check implementation only works if the
shift exponent is valid, otherwise it contains undefined behavior
itself.
Make sure we check that exponent is valid before we proceed to
check the base. Make sure that we actually report invalid values
of base or exponent if -fsanitize=shift-base or
-fsanitize=shift-exponent is specified, respectively.
llvm-svn: 231711
When passing a type with large alignment byval, we were specifying the type's
alignment rather than the alignment that the backend is actually capable of
producing (ABIAlign).
This would be OK (if odd) assuming the backend dealt with it prooperly,
unfortunately it doesn't and trying to pass types with "byval align 16" can
cause it to set fp incorrectly and trash the stack during the prologue. I'll be
fixing that in a separate patch, but Clang should still be emitting IR that's
as close to its intent as possible.
rdar://20059039
llvm-svn: 231706
It's not that easy. If we're only checking -fsanitize=shift-base we
still need to verify that exponent has sane value, otherwise
UBSan-inserted checks for base will contain undefined behavior
themselves.
llvm-svn: 231409
Opt in Win64 to supporting sjlj lowering. We have the backend lowering,
so I think this was just an oversight because WinX86_64TargetCodeGenInfo
doesn't inherit from X86_64TargetCodeGenInfo.
llvm-svn: 231280
This test doesn't provide any value (it just checks that the frontend
produces exactly one compile unit), and it certainly isn't doing what
the comment says. Noticed via IRC review of my update to it in r231083.
llvm-svn: 231152
-fsanitize=shift is now a group that includes both these checks, so
exisiting users should not be affected.
This change introduces two new UBSan kinds that sanitize only left-hand
side and right-hand side of shift operation. In practice, invalid
exponent value (negative or too large) tends to cause more portability
problems, including inconsistencies between different compilers, crashes
and inadequeate results on non-x86 architectures etc. That is,
-fsanitize=shift-exponent failures should generally be addressed first.
As a bonus, this change simplifies CodeGen implementation for emitting left
shift (separate checks for base and exponent are now merged by the
existing generic logic in EmitCheck()), and LLVM IR for these checks
(the number of basic blocks is reduced).
llvm-svn: 231150
Originally we were using the same GCC builtins to lower this AVX2 vector
intrinsic. Instead we will now lower it directly to a vector shuffle.
This will not only allow LLVM to generate better code, but it will also allow us
to remove the GCC intrinsics.
Reviewed by Andrea
This is related to rdar://problem/18742778.
llvm-svn: 231081
For global reg lvalue - use regular store through global register.
For simple lvalue - use simple atomic store.
For bitfields, vector element, extended vector elements - the original value of the whole storage (for vector elements) or of some aligned value (for bitfields) is atomically read, the part of this value for the given lvalue is modified and then use atomic compare-and-exchange operation to try to atomically write modified value (if it was not modified).
Also, changes in this patch fix the bug for '#pragma omp atomic read' applied to extended vector elements.
Differential Revision: http://reviews.llvm.org/D7369
llvm-svn: 230736
The __finally emission block tries to be clever by removing unused continuation
edges if there's an unconditional jump out of the __finally block. With
exception edges, the EH continuation edge isn't always unused though and we'd
crash in a few places.
Just don't be clever. That makes the IR for __finally blocks a bit longer in
some cases (hence small and behavior-preserving changes to existing tests), but
it makes no difference in general and it fixes the last crash from PR22553.
http://reviews.llvm.org/D7918
llvm-svn: 230697
Currently, the NaN values emitted for MIPS architectures do not cover
non-IEEE754-2008 compliant case. This change fixes the issue.
Patch by Vladimir Radosavljevic.
Differential Revision: http://reviews.llvm.org/D7882
llvm-svn: 230653
Original CL description:
Produce less broken basic block sequences for __finally blocks.
The way cleanups (such as PerformSEHFinally) get emitted is that codegen
generates some initialization code, then calls the cleanup's Emit() with the
insertion point set to a good place, then the cleanup is supposed to emit its
stuff, and then codegen might tack in a jump or similar to where the insertion
point is after the cleanup.
The PerformSEHFinally cleanup tries to just stash away the block it's supposed
to codegen into, and then does codegen later, into that stashed block. However,
after codegen'ing the __finally block, it used to set the insertion point to
the finally's continuation block (where the __finally cleanup goes when its body
is completed after regular, non-exceptional control flow). That's not correct,
as that block can (and generally does) already ends in a jump. Instead,
remember the insertion point that was current before the __finally got emitted,
and restore that.
Fixes two of the crashes in PR22553.
llvm-svn: 230503
The way cleanups (such as PerformSEHFinally) get emitted is that codegen
generates some initialization code, then calls the cleanup's Emit() with the
insertion point set to a good place, then the cleanup is supposed to emit its
stuff, and then codegen might tack in a jump or similar to where the insertion
point is after the cleanup.
The PerformSEHFinally cleanup tries to just stash away the block it's supposed
to codegen into, and then does codegen later, into that stashed block. However,
after codegen'ing the __finally block, it used to set the insertion point to
the finally's continuation block (where the __finally cleanup goes when its body
is completed after regular, non-exceptional control flow). That's not correct,
as that block can (and generally does) already ends in a jump. Instead,
remember the insertion point that was current before the __finally got emitted,
and restore that.
Fixes two of the crashes in PR22553.
llvm-svn: 230460
This is a necessary prerequisite for debugging with modules.
The .pcm files become containers that hold the serialized AST which allows
us to store debug information in the module file that can be shared by all
object files that were built importing the module.
This reapplies r230044 with a fixed configure+make build and updated
dependencies and testcase requirements. Over the last iteration this
version adds
- missing target requirements for testcases that specify an x86 triple,
- a missing clangCodeGen.a dependency to libClang.a in the make build.
rdar://problem/19104245
llvm-svn: 230423
The backend should now be able to handle all AAPCS rules based on argument
type, which means Clang no longer has to duplicate the register-counting logic
and the CodeGen can be significantly simplified.
llvm-svn: 230349
MSVC does not support C99 _Complex.
ICC, however, does support it on windows x86_64, and treats it, for purposes of parameter passing, as equivalent to a struct containing two fields (for the real and imaginary part).
Differential Revision: http://reviews.llvm.org/D7825
llvm-svn: 230315
llvm.eh.sjlj.setjmp / llvm.eh.sjlj.longjmp, if the backend is known to
support them outside the Exception Handling context. The default
handling in LLVM codegen doesn't work and will create incorrect code.
The ARM backend on the other hand will assert if the intrinsics are
used.
llvm-svn: 230255
For now -funique-section-names is the default, so no change in default behavior.
The total .o size in a build of llvm and clang goes from 241687775 to 230649031
bytes if -fno-unique-section-names is used.
llvm-svn: 230031
Summary:
The definition for _mm256_insert_epi64 was taking an int, which would get
truncated before being inserted in the vector.
Original patch by Joshua Magee!
Reviewers: bruno, craig.topper
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D7179
llvm-svn: 229811
Not all targets generate 'store atomic' instructions for
'_Atomic(_Complex int)'. Some targets use the __atomic_store builtin instead.
This commit makes the test accept either one.
llvm-svn: 229676
This is a patch for PR22563 ( http://llvm.org/bugs/show_bug.cgi?id=22563 ).
We were not correctly unwrapping a single 256-bit AVX vector that was defined as an array of 1 inside a struct.
We would generate a <4 x float> param/return value instead of <8 x float> and lose half of the vector.
Differential Revision: http://reviews.llvm.org/D7614
llvm-svn: 229408
For #pragma comment(linker, ...) MSVC expects the comment string to be quoted, but for #pragma comment(lib, ...) the compiler itself quotes the library name.
Since this distinction disappears by the time the directive reaches the backend, move quoting for the "lib" version to the frontend.
Differential Revision: http://reviews.llvm.org/D7653
llvm-svn: 229376
Bools are a little tricky, they are i8 in memory and must be coerced
back to i1 before further operations can be performed on them.
This fixes PR22577.
llvm-svn: 229204
The first change won't touch GEPOperators such as these, but the update
script only identifies them by the leading '(' after getelementptr or
'getelementptr inbounds', so update this test to at least have those
features to allow auto-migrating.
llvm-svn: 229198
The /volatile:ms semantics turn volatile loads and stores into atomic
acquire and release operations. This distinction is important because
volatile memory operations do not form a happens-before relationship
with non-atomic memory. This means that a volatile store is not
sufficient for implementing a mutex unlock routine.
Differential Revision: http://reviews.llvm.org/D7580
llvm-svn: 229082
Summary:
This patch installs an InlineAsmDiagnosticsHandler to avoid the crash
report when the input is bitcode and the bitcode contains invalid inline
assembly. The handler will simply print the same error message that will
print from the backend.
Add CHECK in test-case
Reviewers: echristo, rafael
Reviewed By: rafael
Subscribers: rafael, cfe-commits
Differential Revision: http://reviews.llvm.org/D7568
llvm-svn: 228898
a non-uniqueable temporary node that is only turned into a permanent
unique or distinct node after it is finished.
Otherwise an intermediate node may get accidentally uniqued with another
node as illustrated by the testcase.
Paired commit with LLVM.
llvm-svn: 228855
Also removed unused builtins.
Original patch by Andrea Di Biagio!
Reviewers: craig.topper, nadav
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D7199
llvm-svn: 228481
modifiers on them. If we have a matching output constraint with
an early clobber make sure we don't propagate that to the input
constraint.
llvm-svn: 228422
After r228258, Clang started emitting C++ EH IR that LLVM wasn't ready
to deal with, even when exceptions were disabled with /EHs-. This time,
make /EHs- turn off -fexceptions while still emitting exceptional
constructs in functions using __try. Since Sema rejects C++ exception
handling constructs before CodeGen, landingpads should only appear in
such functions as the result of a __try.
llvm-svn: 228329
Previously we would simply double-emit the body of the __finally block,
but that doesn't work when it contains any kind of Decl, which we can't
double emit.
This fixes that by emitting the block once and branching into a shared
code region and then branching back out.
llvm-svn: 228222
Summary:
Named registers with the constraint "=&r" currently lose the early clobber flag
and turn into "=r" when converted to LLVM-IR. This patch correctly passes it on.
Reviewers: atanasyan
Reviewed By: atanasyan
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D7346
llvm-svn: 228143
Create a new TargetCodeGenInfo for Windows on ARM to permit annotating the
functions with stack-probe-size (for /Gs and -mstack-probe-support) for
generating the stack probe necessary for Windows targets. This will be used by
the backend when lowering the frame to generate the stack probe appropriately.
llvm-svn: 227641
On targets which use the MSVCRT, setjmp is a macro which expands to
_setjmp or _setjmpex.
_setjmp and _setjmpex have a secret, hidden argument which is not listed
in the function prototype on X64 and WoA. This hidden argument always
seems to be the frame pointer.
_setjmpex isn't used on X86, _setjmp is magically replaced with a call
to _setjmp3. The second argument is zero for 'normal' setjmp/longjmp
pairs, otherwise it is a count of additional variadic arguments. This
is used when setjmp appears inside of a try or __try.
It is not safe to use a pointer to setjmp because _setjmp, _setjmpex and
_setmp3 are not compatible with setjmp.
llvm-svn: 227426
Summary:
It was used for interoperability with PNaCl's calling conventions, but
it's no longer needed.
Also Remove NaCl*ABIInfo which just existed to delegate to either the portable
or native ABIInfo, and remove checkCallingConvention which was now a no-op
override.
Reviewers: jvoung
Subscribers: jfb, llvm-commits
Differential Revision: http://reviews.llvm.org/D7206
llvm-svn: 227362
The backend won't run LowerExpect on -O0. In a debug LTO build, this results in llvm.expect intrinsics being in the LTO IR which doesn't know how to optimize them.
Thanks to Chandler for the suggestion and review.
Differential revision: http://reviews.llvm.org/D7183
llvm-svn: 227135
The lowering looks a lot like normal EH lowering, with the exception
that the exceptions are caught by executing filter expression code
instead of matching typeinfo globals. The filter expressions are
outlined into functions which are used in landingpad clauses where
typeinfo would normally go.
Major aspects that still need work:
- Non-call exceptions in __try bodies won't work yet. The plan is to
outline the __try block in the frontend to keep things simple.
- Filter expressions cannot use local variables until capturing is
implemented.
- __finally blocks will not run after exceptions. Fixing this requires
work in the LLVM SEH preparation pass.
The IR lowering looks like this:
// C code:
bool safe_div(int n, int d, int *r) {
__try {
*r = normal_div(n, d);
} __except(_exception_code() == EXCEPTION_INT_DIVIDE_BY_ZERO) {
return false;
}
return true;
}
; LLVM IR:
define i32 @filter(i8* %e, i8* %fp) {
%ehptrs = bitcast i8* %e to i32**
%ehrec = load i32** %ehptrs
%code = load i32* %ehrec
%matches = icmp eq i32 %code, i32 u0xC0000094
%matches.i32 = zext i1 %matches to i32
ret i32 %matches.i32
}
define i1 zeroext @safe_div(i32 %n, i32 %d, i32* %r) {
%rr = invoke i32 @normal_div(i32 %n, i32 %d)
to label %normal unwind to label %lpad
normal:
store i32 %rr, i32* %r
ret i1 1
lpad:
%ehvals = landingpad {i8*, i32} personality i32 (...)* @__C_specific_handler
catch i8* bitcast (i32 (i8*, i8*)* @filter to i8*)
%ehptr = extractvalue {i8*, i32} %ehvals, i32 0
%sel = extractvalue {i8*, i32} %ehvals, i32 1
%filter_sel = call i32 @llvm.eh.seh.typeid.for(i8* bitcast (i32 (i8*, i8*)* @filter to i8*))
%matches = icmp eq i32 %sel, %filter_sel
br i1 %matches, label %eh.except, label %eh.resume
eh.except:
ret i1 false
eh.resume:
resume
}
Reviewers: rjmccall, rsmith, majnemer
Differential Revision: http://reviews.llvm.org/D5607
llvm-svn: 226760
It fails on Windows due to another temporary being emitted first, so the
LLVM internal renaming scheme gives out the name
__block_descriptor_tmp1.
llvm-svn: 226757
Currently we emit DeferredDeclsToEmit in reverse order. This patch changes that.
The advantages of the change are that
* The output order is a bit closer to the source order. The change to
test/CodeGenCXX/pod-member-memcpys.cpp is a good example.
* If we decide to deffer more, it will not cause as large changes in the
estcases as it would without this patch.
llvm-svn: 226751
Analogous to AVX2, these need to be implemented as macros to properly
propagate the immediate index operand.
Part of <rdar://problem/17688758>
llvm-svn: 226496
Summary:
This fixes MultiSource/Applications/lemon on big-endian N32 by correcting the
handling of the argument to wait(). glibc defines it as a transparent union of
void* and int*. Such unions are passed according to the rules of the first
member so the argument must be passed as if it were a void* (sign extended from
i32 to i64) and not as a union (shifted to the upper bits of an i64).
wait() already behaves correctly on big-endian O32 and N64 since the union is
already the same size as an argument slot.
Reviewers: atanasyan
Reviewed By: atanasyan
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D6963
llvm-svn: 225981
These are implemented with __builtin_shufflevector just like AVX.
We have some tests on the LLVM side to assert that these shufflevectors do
indeed generate the corresponding unpck instruction.
Part of <rdar://problem/17688758>
llvm-svn: 225922
Summary:
The Mips ABI's treat pointers in the same way as integers. They are
sign-extended to 32-bit for O32, and 64-bit for N32/N64. This doesn't matter
for O32 and N64 where pointers are already the correct width but it does matter
for big-endian N32, where pointers are 32-bit and need promoting.
The caller side is already passing pointers correctly. This patch corrects the
callee.
Reviewers: vmedic, atanasyan
Reviewed By: atanasyan
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D6812
llvm-svn: 225782
Introduce the following -fsanitize-recover flags:
- -fsanitize-recover=<list>: Enable recovery for selected checks or
group of checks. It is forbidden to explicitly list unrecoverable
sanitizers here (that is, "address", "unreachable", "return").
- -fno-sanitize-recover=<list>: Disable recovery for selected checks or
group of checks.
- -f(no-)?sanitize-recover is now a synonym for
-f(no-)?sanitize-recover=undefined,integer and will soon be deprecated.
These flags are parsed left to right, and mask of "recoverable"
sanitizer is updated accordingly, much like what we do for -fsanitize= flags.
-fsanitize= and -fsanitize-recover= flag families are independent.
CodeGen change: If there is a single UBSan handler function, responsible
for implementing multiple checks, which have different recoverable setting,
then we emit two handler calls instead of one:
the first one for the set of "unrecoverable" checks, another one - for
set of "recoverable" checks. If all checks implemented by a handler have the
same recoverability setting, then the generated code will be the same.
llvm-svn: 225719
The llvm IR until recently had no support for comdats. This was a problem when
targeting C++ on ELF/COFF as just using weak linkage would cause quite a bit of
dead bits to remain on the executable (unless -ffunction-sections,
-fdata-sections and --gc-sections were used).
To fix the problem, llvm's codegen will just assume that any weak or linkonce
that is not in an explicit comdat should be output in one with the same name as
the global.
This unfortunately breaks cases like pr19848 where a weak symbol is not
xpected to be part of any comdat.
Now that we have explicit comdats in the IR, we can finally get both cases
right.
This first patch just makes clang give explicit comdats to GlobalValues where
t is allowed to.
A followup patch to llvm will then stop implicitly producing comdats.
llvm-svn: 225705