Pointer-sized alignment is sufficient, as we only ever read single values
from the table. Without this, the backend would bump the alignment to 16
bytes whenever the vtable is larger than 16 bytes. That is great for
structures that are accessed with vector instructions or copied around, but
that's simply not the case for vtables.
Shrinks the data segment of a Release x86_64 clang by 0.3%. The wins are
larger for i386 and code bases that use vtables more often than we do.
This matches the behavior of GCC 5.
llvm-svn: 217495
Summary:
This patch implements a new UBSan check, which verifies
that function arguments declared to be nonnull with __attribute__((nonnull))
are actually nonnull at runtime.
To implement this check, we pass the FunctionDecl to CodeGenFunction::EmitCallArgs
(where applicable), and if the function declaration has the nonnull attribute
specified for a certain formal parameter, we compare the corresponding RValue to
null as soon as it is calculated.
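For illustration, a minimal sketch of the kind of case the new check catches
(the function name is invented, not from the patch):
  __attribute__((nonnull(1))) static int count_bytes(const char *s) {
    int n = 0;
    while (s[n]) ++n;
    return n;
  }
  int main(void) {
    const char *p = (const char *)0;
    return count_bytes(p);  /* the new check reports the null argument here */
  }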
Test Plan: regression test suite
Reviewers: rsmith
Reviewed By: rsmith
Subscribers: cfe-commits, rnk
Differential Revision: http://reviews.llvm.org/D5082
llvm-svn: 217389
There were code paths that were duplicated for constructors and destructors just
because we have both CXXCtorType and CXXDtorType.
This patch introduces a unified enum and reduces code duplication a bit.
llvm-svn: 217383
This makes use of the recently-added @llvm.assume intrinsic to implement a
__builtin_assume(bool) intrinsic (to provide additional information to the
optimizer). This hooks up __assume in MS-compatibility mode to mirror
__builtin_assume (the semantics have been intentionally kept compatible), and
implements GCC's __builtin_assume_aligned as assume((p - o) & mask == 0). LLVM
now contains special logic to deal with assumptions of this form.
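A rough sketch of how the builtins are meant to be used (the function and the
constants are invented for illustration):
  void fill(int *p, int n) {
    __builtin_assume(n > 0);                   /* hint: n is positive */
    int *q = __builtin_assume_aligned(p, 32);  /* hint: p is 32-byte aligned */
    for (int i = 0; i < n; ++i)
      q[i] = 0;                                /* optimizer may exploit both hints */
  }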
llvm-svn: 217349
This patch adds support for the 32-bit numeric max/min and directed round-to-integral NEON intrinsics that were added as part of v8, along with unit tests.
Patch by Graham Hunter!
llvm-svn: 217242
For naked functions with parameters, Clang would still emit stores in the prologue
that would clobber the stack, because LLVM doesn't set up a stack frame. (This
shows up in -O0 compiles, because the stores are optimized away otherwise.)
For example:
  __attribute__((naked)) int f(int x) {
    asm("movl $42, %eax");
    asm("retl");
  }
Would result in:
_Z1fi:
        movl 12(%esp), %eax
        movl %eax, (%esp)        <--- Oops.
        movl $42, %eax
        retl
Differential Revision: http://reviews.llvm.org/D5183
llvm-svn: 217198
If control falls off the end of a function after an __asm block, MSVC
assumes that the inline assembly filled the EAX and possibly EDX
registers with an appropriate return value. This functionality is used
in inline functions returning 64-bit integers in system headers, so we
need some amount of compatibility.
This is implemented in Clang by adding extra output constraints to every
inline asm block, and storing the resulting output registers into the
return value slot. If we see an asm block somewhere in the function
body, we emit a normal epilogue instead of marking the end of the
function as unreachable.
Normal returns in functions not using this functionality will overwrite
the return value slot, and in most cases LLVM should be able to
eliminate the dead stores.
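A rough sketch of the pattern this supports (hypothetical function, MSVC mode
on x86):
  unsigned int get_one(void) {
    __asm {
      mov eax, 1
    }
    /* control falls off the end: a normal epilogue is now emitted and the
       value left in EAX becomes the return value */
  }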
Fixes PR17201.
Reviewed By: majnemer
Differential Revision: http://reviews.llvm.org/D5177
llvm-svn: 217187
Summary:
This allows us to easily find them in the backend after the aggregates have
been lowered to other types. This is important on big-endian targets using
the N32/N64 ABIs, since these ABIs must shift small structures into the
upper bits of the register.
Reviewers: atanasyan
Reviewed By: atanasyan
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D5005
llvm-svn: 217160
Summary:
They are returned indirectly which causes the other arguments to move to
the next argument slot.
With this, utils/ABITest does not discover any failing cases in the first
500 attempts on big/little endian for O32. Previously some of these failed.
Also tested N32/N64 little endian (big endian has other known issues) with
no issues.
Reviewers: atanasyan
Reviewed By: atanasyan
Subscribers: atanasyan, cfe-commits
Differential Revision: http://reviews.llvm.org/D4811
llvm-svn: 217147
Using the intrinsic allows the SelectionDAGBuilder to turn this call
into the FABS node, and the intrinsic is also something the vectorizer knows
how to vectorize.
This patch also sets the readnone attribute on this call, which should
enable additional optimizations.
llvm-svn: 217042
This avoids encoding information about the function prototype into the
thunk at the cost of some function prototype bitcast gymnastics.
Fixes PR20653.
llvm-svn: 216782
In theory, it'd be nice if we could move to a case where all buried
pointers were buried via unique_ptr to demonstrate that the program had
finished with the value (that we could really have cleanly deallocated
it) but instead chose to bury it.
I think the main reason that's not possible right now is the various
IntrusiveRefCntPtrs in the Frontend, sharing ownership for a variety of
compiler bits (see the various similar
"CompilerInstance::releaseAndLeak*" functions). I have yet to figure out
their correct ownership semantics - but perhaps, even if the
intrusiveness can be removed, the shared ownership may yet remain and
that would lead to a non-unique burying as is there today. (though we
could model that a little better - by passing in a shared_ptr, etc -
rather than needing the two-step that's currently used in those other
releaseAndLeak* functions)
This might be a bit more robust if BuryPointer took the boolean:
BuryPointer(bool, unique_ptr<T>)
and the choice to bury was made internally. That way, even when
DisableFree was not set, the unique_ptr would still be null in the
caller, and there'd be no chance of accidentally having a different
codepath where the value is used after burial under !DisableFree but
becomes null only under DisableFree, etc.
llvm-svn: 216742
Previously, EnterStructPointerForCoercedAccess used the alloc size when determining how to convert. This was problematic because there were situations where the alloc size was larger than the store size. For example, if the first element of a structure were i24 and the destination type were i32, the old code would generate a GEP and an i24 load. The code should compare store sizes to ensure the whole object is loaded. I have attached a test case.
This patch modifies the output of arm64-be-bitfield.c test case, but the new IR seems to be equivalent, and after -O3, the compiler generates identical ARM assembly. (asr x0, x0, #54)
Patch by Thomas Jablin!
llvm-svn: 216722
Summary:
We did a great job getting this wrong:
- We messed up which LLVM IR types to use for arguments and return values.
  The optimized libcalls use integer types for values.
  Clang attempted to use the IR type which corresponds to the value
  passed in instead of using an appropriately sized integer type. This
  would result in violations of the ABI for, as an example, floating
  point types.
- We didn't bother recording the result of the atomic libcall in the
  destination memory.
Instead, call the functions with arguments matching the type of the
libcall prototype's parameters.
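As a rough illustration (not from the patch): on a target where this store is
lowered to an optimized libcall such as __atomic_store_4, the float payload is
now passed as an appropriately sized integer rather than as a float argument.
  _Atomic float shared;
  void publish(float v) {
    __c11_atomic_store(&shared, v, __ATOMIC_SEQ_CST);
  }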
This fixes PR20780.
Differential Revision: http://reviews.llvm.org/D5098
llvm-svn: 216714
Summary:
The current implementation of the asan cookie is incorrect:
we add nosanitize metadata to the cookie load, but the metadata may be lost
and we will instrument the load from poisoned memory.
This change replaces the load with a call to __asan_load_cxx_array_cookie (r216692).
Reviewers: rsmith
Reviewed By: rsmith
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D5111
llvm-svn: 216702
For the following code:
  __declspec(dllimport) int f(int x);
  int user(int x) {
    return f(x);
  }
  int f(int x) { return 1; }
Clang will drop the dllimport attribute in the AST, but CodeGen would have
already put it on the llvm::Function, and that would never get updated.
(The same thing happens for global variables.)
This makes Clang check for the dropped DLL attribute case each time the LLVM object
is referenced.
This isn't perfect, because we will still get it wrong if the function is
never referenced by codegen after the attribute is dropped, but this handles
the common cases and makes us not fail in the verifier.
llvm-svn: 216699
linkage related to generation of the OBJC_SELECTOR_REFERENCES symbol
needed when generating a call to 'super' in a class method.
// rdar://18150301
llvm-svn: 216676
ACLE 2.0 allows __fp16 to be used as a function argument or return
type. This patch enables that for AArch64.
This also fixes an existing bug that causes clang to not allow
homogeneous floating-point aggregates with a base type of __fp16. This
is valid for AAPCS64, but not for AAPCS-VFP.
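For example (a sketch, not taken from the tests):
  __fp16 add(__fp16 a, __fp16 b) {
    return a + b;            /* __fp16 argument and return types */
  }
  struct hfa {
    __fp16 x, y, z, w;       /* homogeneous aggregate with __fp16 base type */
  };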
llvm-svn: 216558
This tidies up some ARM-specific code added by r208417 to move it out
of the target-independent parts of clang into TargetInfo.cpp. This
also has the advantage that we can now flatten struct arguments to
variadic AAPCS functions.
llvm-svn: 216535
This time though, preserve the extension for bool types since that's compatible
with what MSVC expects.
See http://reviews.llvm.org/D4380
llvm-svn: 216507
Summary: When the main file is created from a membuffer, there is no file entry that can be retrieved. This uses "__GLOBAL_I_a" in that case which is what was always used before r208128.
Reviewers: majnemer, thakis
Reviewed By: thakis
Subscribers: yaron.keren, rsmith, cfe-commits
Differential Revision: http://reviews.llvm.org/D5043
llvm-svn: 216495
Summary:
MSVC doesn't extend integer types smaller than 64 bits, so to preserve
binary compatibility, clang shouldn't either.
For example, the following C code built with MSVC:
unsigned test(unsigned v);
unsigned foobar(unsigned short);
int main() { return test(0xffffffff) + foobar(28); }
Produces the following:
0000000000000004: B9 FF FF FF FF mov ecx,0FFFFFFFFh
0000000000000009: E8 00 00 00 00 call test
000000000000000E: 89 44 24 20 mov dword ptr [rsp+20h],eax
0000000000000012: 66 B9 1C 00 mov cx,1Ch
0000000000000016: E8 00 00 00 00 call foobar
And as you can see, when setting up the call to foobar, only cx is overwritten.
If foobar is compiled with clang, then the zeroext attribute clang added to the
parameter means the rest of the register, which contains garbage, could be used.
For example if foobar is:
  unsigned foobar(unsigned short v) {
    return v;
  }
Compiled with clang -fomit-frame-pointer -O3 gives the following assembly:
foobar:
0000000000000000: 89 C8 mov eax,ecx
0000000000000002: C3 ret
And that function would return garbage because the 16 most significant bits of
ecx still contain garbage from the first call.
With this change, the code for that function is now:
foobar:
0000000000000000: 0F B7 C1 movzx eax,cx
0000000000000003: C3 ret
Reviewers: chapuni, rnk
Reviewed By: rnk
Subscribers: majnemer, cfe-commits
Differential Revision: http://reviews.llvm.org/D4380
llvm-svn: 216491
Summary:
PR19838
When operator new[] is called and an array cookie is created
we want asan to detect buffer overflow bugs that touch the cookie.
For that we need to
a) poison the shadow for the array cookie (call __asan_poison_cxx_array_cookie).
b) ignore the legal accesses to the cookie generated by clang (add 'nosanitize' metadata)
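For illustration, a minimal sketch of where such a cookie appears (the names
are invented):
  struct S {
    ~S() {}          // non-trivial destructor => operator new[] stores a cookie
    char data[8];
  };
  S *make(unsigned n) {
    return new S[n]; // the cookie (element count) precedes the elements; its
                     // shadow is now poisoned, and clang's own cookie accesses
                     // carry 'nosanitize' metadata
  }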
Reviewers: timurrrr, samsonov, rsmith
Reviewed By: rsmith
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D4774
llvm-svn: 216434
into EmitCXXMemberOrOperatorCall methods. In the end we want
to make the declaration visible in the EmitCallArgs() method, which
would allow us to alter CodeGen depending on function/parameter
attributes.
No functionality change.
llvm-svn: 216404
PowerPC uses the special PPC_FP128 type for long double on Linux, which is
composed of two 64-bit doubles. The higher-order double (which contains the
overall sign) comes first, and so the __builtin_signbitl implementation
requires special handling to extract the sign bit.
Fixes PR20691.
llvm-svn: 216341
Similar to r215768 (which fixed the same case for while loops). To quote
r215768's commit message:
"A little test case simplification - this could be simplified further,
though there are certainly interesting connections to the if/else
construct so I'm hesitant to remove that entirely though it does appear
somewhat unrelated.
(similar fix to r215766, related to PR19864)"
llvm-svn: 216297
for loops introduce two scopes - one for the outer loop variable and its
initialization, and another for the body of the loop, including any
variable declared inside the loop condition.
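For example (a C++ sketch): 'i' lives in the outer scope, while 'b', declared
in the condition, lives in the inner scope together with the body.
  void count(int n) {
    for (int i = 0; bool b = (i != n); ++i) {
      (void)b;
    }
  }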
llvm-svn: 216288
Summary:
This refactoring introduces the ClangToLLVMArgMapping class, which
encapsulates information about the order in which the function arguments listed
in CGFunctionInfo should be passed to the actual LLVM IR function, such as:
1) the position of sret, if there is any
2) the position of the inalloca argument, if there is any
3) the position of the helper padding argument for each call argument
4) the positions of the regular arguments (there can be many if an argument is expanded).
Simplify several related methods (ConstructAttributeList, EmitFunctionProlog
and EmitCall): now they don't have to maintain iterators over the list of LLVM
IR function arguments and deal with all the sret/inalloca/this complexities;
they just use the expected positions of LLVM IR arguments stored in
ClangToLLVMArgMapping.
This may increase the running time of EmitFunctionProlog, as we have to traverse
expandable arguments twice, but further refactoring will let us speed up
EmitCall by passing the already-computed CallArgsToIRArgsMapping to
ConstructAttributeList, thus avoiding traversing expandable arguments there.
No functionality change.
Test Plan: regression test suite
Reviewers: majnemer, rnk
Reviewed By: rnk
Subscribers: cfe-commits, rjmccall, timurrrr
Differential Revision: http://reviews.llvm.org/D4938
llvm-svn: 216251
Summary:
This is a first small step towards passing a generic "Expr" instead of an
ArgBeg/ArgEnd pair into the EmitCallArgs() family of methods. Having "Expr" will
allow us to get the corresponding FunctionDecl and its ParmVarDecls,
thus allowing us to alter CodeGen depending on the function/parameter
attributes.
No functionality change.
Test Plan: regression test suite
Reviewers: rnk
Reviewed By: rnk
Subscribers: aemerson, cfe-commits
Differential Revision: http://reviews.llvm.org/D4915
llvm-svn: 216214
The profile data format was recently updated and the new indexing api
requires the code coverage tool to know the function's hash as well
as the function's name to get the execution counts for a function.
Differential Revision: http://reviews.llvm.org/D4995
llvm-svn: 216208
A little test case simplification - this could be simplified further,
though there are certainly interesting connections to the if/else
construct so I'm hesitant to remove that entirely though it does appear
somewhat unrelated.
(similar fix to r215766, related to PR19864)
llvm-svn: 215768
This avoids debuggers stepping to strange places (like the last
statement in the loop body, or the first statement in the if).
This is not the whole answer, though - similar bugs no doubt exist in
other loops (patches to follow) and attributing exception handling code
to the correct line is also tricky (based on the previous fix to
PR19864, exception handling is still erroneously attributed to the 'if'
line).
llvm-svn: 215766
It fits better with LLVM's memory model to try to do this in the
backend. Specifically, narrowing wide loads in the backends should be
relatively straightforward and is generally valuable, whereas widening
loads tends to be very constrained.
Discussion here:
http://lists.cs.uiuc.edu/pipermail/cfe-commits/Week-of-Mon-20140811/112581.html
This reverts commit r215614.
llvm-svn: 215648
Currently when laying out bitfields that don't need any padding, we
represent them as a wide enough int to contain all of the bits. This
can be hard on the backend since we'll do things like represent stores
to a few bits as loading an i144, masking it with a large constant,
and storing it back.
This turns up in less pathological cases where we load and mask a 64-bit
word on a 32-bit platform when we actually only need to access 32 bits.
This leads to bad code being generated in most of our 32-bit backends.
In practice, there are often natural breaks in bitfields, and it's a
fairly simple and effective heuristic to split these fields into legal
integer-sized chunks when it will be equivalent (i.e., it won't force us
to add any extra padding).
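A sketch of the kind of layout this affects (made-up fields): on a 32-bit
target these used to be accessed through one wide integer, but the natural
break after 32 bits lets them be split into two legal i32 chunks with no extra
padding.
  struct flags {
    unsigned a : 12;
    unsigned b : 20;   /* first 32-bit chunk ends here */
    unsigned c : 16;
    unsigned d : 16;   /* second 32-bit chunk */
  };
  void set_c(struct flags *f) {
    f->c = 7;          /* now a 32-bit load/mask/store rather than a wider one */
  }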
llvm-svn: 215614
recursively within the emission of another inline function. This ultimately
led to us emitting the same inline function definition twice, which we then
rejected because we believed we had a mangled name conflict.
llvm-svn: 215579
Summary:
This patch adds a runtime check verifying that functions
annotated with the "returns_nonnull" attribute do in fact return nonnull pointers.
It is based on suggestion by Jakub Jelinek:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20140623/223693.html.
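A minimal sketch of what the check catches (hypothetical function):
  __attribute__((returns_nonnull)) static char *get_name(int have_name) {
    return have_name ? "anon" : 0;   /* returning 0 here triggers the check */
  }
  int main(void) {
    return get_name(0) != 0;
  }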
Test Plan: regression test suite
Reviewers: rsmith
Reviewed By: rsmith
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D4849
llvm-svn: 215485
It cannot be compiled on Visual Studio 2012.
clang\include\clang/Frontend/CompilerInstance.h(153):
error C2248: 'std::unique_ptr<_Ty>::unique_ptr' : cannot access private member declared in class 'std::unique_ptr<_Ty>'
with
[
_Ty=llvm::raw_ostream
]
D:\Program Files (x86)\Microsoft Visual Studio 11.0\VC\include\memory(1447) : see declaration of 'std::unique_ptr<_Ty>::unique_ptr'
with
[
_Ty=llvm::raw_ostream
]
This diagnostic occurred in the compiler generated function 'clang::CompilerInstance::OutputFile::OutputFile(const clang::CompilerInstance::OutputFile &)'
llvm-svn: 215346
After post-commit review and community discussion, this seems like a
reasonable direction to continue, making ownership semantics explicit in
the source using the type system.
llvm-svn: 215323
Due to the possible presence of return-by-out parameters, using the LLVM
argument number count when numbering debug info arguments can end up
off-by-one. This could produce two arguments with the same number, which
would in turn cause LLVM to emit only one of those arguments (whichever
it found last) or assert (r215157).
llvm-svn: 215227
MSVC doesn't decide what the inheritance model for a returned member
pointer is *until* a call expression returns it.
This fixes PR20017.
llvm-svn: 215164
There are no vtable offset offsets in the MS ABI, but vbtable offsets
are analogous. There are no consumers of this information yet, but at
least we don't crash now.
llvm-svn: 215149
This reverts commit r215137.
This doesn't work at all; an offset-offset is probably different from an
offset. I'm scared that this passed our normal test suite.
llvm-svn: 215141
to instruct the code generator to not enforce a higher alignment
than the given number (of bytes) when accessing memory via an opaque
pointer or reference. Patch reviewed by John McCall (with post-commit
review pending). rdar://16254558
llvm-svn: 214911
This patch adds the '-fcoverage-mapping' option which
allows clang to generate the coverage mapping information
that can be used to provide code coverage analysis using
the execution counts obtained from the instrumentation
based profiling (-fprofile-instr-generate).
llvm-svn: 214752
Instead of creating global variables for source locations and global names,
just create metadata nodes and strings. They will be transformed into actual
globals in the instrumentation pass (if necessary). This approach is more
flexible:
1) we don't have to ensure that our custom globals survive all the optimizations
2) if globals are discarded for some reason, we will simply ignore metadata for them
and won't have to erase corresponding globals
3) metadata for source locations can be reused for other purposes: e.g. we may
attach source location metadata to alloca instructions and provide better descriptions
for stack variables in ASan error reports.
No functionality change.
llvm-svn: 214604
We've added support for multiple functions with the same name in
LLVM's profile data, so the lookup returning the function hash it
found doesn't make sense anymore. Update to pass in the hash we
expect.
This also adds a test that the version 1 format is still readable,
since the new API is expected to handle that.
llvm-svn: 214586
It is responsible for generating metadata consumed by sanitizer instrumentation
passes in the backend. Move several methods from CodeGenModule to SanitizerMetadata.
For now the class is stateless, but soon that won't be the case.
Instead of creating globals providing source-level information to ASan, we will create
metadata nodes/strings which will be turned into actual global variables in the
backend (if needed).
No functionality change.
llvm-svn: 214564
It appears that the backend does not handle all cases that were handled by clang.
In particular, it does not handle structs as used in
SingleSource/UnitTests/2003-05-07-VarArgs.
llvm-svn: 214512
Summary:
This patch causes clang to emit va_arg instructions to the backend instead of
expanding them into an implementation itself. The backend already implements
va_arg since this is necessary for NaCl so this patch is removing redundant
code.
Together with the llvm patch (D4556) that accounts for the effect of endianness
on the expansion of va_arg, this fixes PR19612.
Depends on D4556
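For reference, a plain va_arg use of the sort that now reaches the backend as
a va_arg instruction (a sketch, not from the test suite):
  #include <stdarg.h>
  int sum_ints(int count, ...) {
    va_list ap;
    va_start(ap, count);
    int total = 0;
    for (int i = 0; i < count; ++i)
      total += va_arg(ap, int);
    va_end(ap);
    return total;
  }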
Reviewers: sstankovic, dsanders
Reviewed By: dsanders
Subscribers: rnk, cfe-commits
Differential Revision: http://reviews.llvm.org/D4742
llvm-svn: 214497
they're somehow missing a body. Looks like this was left behind when the loop
was generalized, and it's not been problematic before because without modules,
a used, implicit special member function declaration must be a definition.
This was resulting in us trying to emit a constructor declaration rather than
a definition, and producing a constructor missing its member initializers.
llvm-svn: 214473
To support constant expressions, this patch replaces the integer value in the loop hint attribute with an expression. The integer value was also storing the pragma's state for options like vectorize(enable/disable) and the pragma unroll and nounroll directives. A state variable is introduced to hold the state of those options/pragmas. This moves the validation of the state (keywords) from the SemaStmtAttr handler to the loop hint annotation token handler.
Resubmit with changes to try to fix the build-bot issue.
Reviewed by Aaron Ballman
llvm-svn: 214432
or a class derived from T. We already supported this when initializing
_Atomic(T) from T for most (and maybe all) other reasonable values of T.
llvm-svn: 214390
To support constant expressions, this patch replaces the integer value in the loop hint attribute with an expression. The integer value was also storing the pragma's state for options like vectorize(enable/disable) and the pragma unroll and nounroll directives. A state variable is introduced to hold the state of those options/pragmas. This moves the validation of the state (keywords) from the SemaStmtAttr handler to the loop hint annotation token handler.
Reviewed by Aaron Ballman
llvm-svn: 214333
As defined in the SPIR 1.2 specification, this node behaves similarly to
kernel_arg_type but will print the underlying type name, e.g., without
typedefs.
Example:
typedef unsigned int myunsignedint;
would report:
'myunsignedint' in the kernel_arg_type node
'uint' in the kernel_arg_base_type node
llvm-svn: 214308
This fixes a bug where kernel_arg_type was always changing 'unsigned ' to 'u'
for any parameter type, including non-canonical types.
Example:
typedef unsigned int myunsignedint;
would report:
"myunt"
llvm-svn: 214305
The MS ABI has a notion of 'required alignment' for fields; this
alignment supersedes pragma pack directives.
MSVC takes into account alignment attributes on typedefs when
determining whether or not a field has a certain required alignment.
Do the same in clang by tracking whether or not we saw such an attribute
when calculating the type's bitwidth and alignment.
This fixes PR20418.
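A sketch of the kind of case involved (names invented, MSVC mode):
  #pragma pack(push, 1)
  typedef __declspec(align(8)) int aligned_int;
  struct record {
    char tag;
    aligned_int value;   /* keeps its 8-byte required alignment despite pack(1) */
  };
  #pragma pack(pop)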
Reviewers: rnk
Differential Revision: http://reviews.llvm.org/D4714
llvm-svn: 214274
Merge vrshr_n_v and vqshlu_n_v with ARM.
Remove FIXME comments for others as they can't actually be shared.
NFC.
Differential Revision: http://reviews.llvm.org/D4697
llvm-svn: 214173
This broke the following gdb tests:
gdb.base__annota1.exp
gdb.base__consecutive.exp
gdb.python__py-symtab.exp
gdb.reverse__consecutive-precsave.exp
gdb.reverse__consecutive-reverse.exp
I will look into this.
This reverts commit 214162.
llvm-svn: 214163
This allows us to give more precise diagnostics.
Diego kindly tested the impact on debug info size: "The increase on average
debug sizes is 0.1%. The total file size increase is ~0%."
llvm-svn: 214162
This isn't nearly as elaborate as the GCC inline asm which emits an
array of source locations, but it's very, very hard to trigger backend
diagnostics in MS inline asm because we parse it up front with good
source information, unlike GCC inline asm.
Currently I can trigger a "inline assembly requires more registers than
available" diagnostic with this code:
  void foo();
  void bar() {
    __asm pusha
    __asm call foo
    __asm popa
  }
However, if I committed that as a test case, I would have to remove it
once I fix PR20052.
llvm-svn: 214141
While Clang now supports both ELFv1 and ELFv2 ABIs, their use is currently
hard-coded via the target triple: powerpc64-linux is always ELFv1, while
powerpc64le-linux is always ELFv2.
These are of course the most common scenarios, but in principle it is
possible to support the ELFv2 ABI on big-endian or the ELFv1 ABI on
little-endian systems (and GCC does support that), and there are some
special use cases for that (e.g. certain Linux kernel versions could
only be built using ELFv1 on LE).
This patch implements the Clang side of supporting this, based on the
LLVM commit 214072. The command line options -mabi=elfv1 or -mabi=elfv2
select the desired ABI if present. (If not, Clang uses the same default
rules as now.)
Specifically, the patch implements the following changes based on the
presence of the -mabi= option:
In the driver:
- Pass the appropriate -target-abi flag to the back-end
- Select the correct dynamic loader version (/lib64/ld64.so.[12])
In the preprocessor:
- Define _CALL_ELF to the appropriate value (1 or 2)
In the compiler back-end:
- Select the correct ABI in TargetInfo.cpp
- Select the desired ABI for LLVM via feature (elfv1/elfv2)
llvm-svn: 214074
This moves some memptr specific code into the generic thunk emission
codepath.
Fixes PR20053.
Reviewers: majnemer
Differential Revision: http://reviews.llvm.org/D4613
llvm-svn: 214004
llvm revision 210639 renamed the -global-merge backend option to
-enable-global-merge. This change simply updates clang to match that.
Patch by Steven Wu!
llvm-svn: 213993
Previously we were building up the inalloca struct in the usual pattern
of return type followed by arguments. However, on Windows, 'this'
always precedes the 'sret' parameter, so we need to insert it into the
struct first as a special case.
llvm-svn: 213990
The target method of the thunk will perform the cleanup. This can't be
tested in 32-bit x86 yet because passing something by value would create
an inalloca, and we refuse to generate broken code for that.
llvm-svn: 213976
While -fno-rtti-data would correctly avoid referencing the RTTI complete
object locators in the VFTable itself, it would still emit them anyway.
llvm-svn: 213841
The main subtlety here is that the Darwin tools still need to be given "-arch
arm64" rather than "-arch aarch64". Fortunately this already goes via a custom
function to handle weird edge-cases in other architectures, and it is tested.
I removed a few arm64_be tests because that really isn't an interesting thing
to worry about. No-one using big-endian is also referring to the target as
arm64 (at least as far as toolchains go). Mostly they date from when arm64 was
a separate target and we *did* need a parallel name simply to test it at all.
Now aarch64_be is sufficient.
llvm-svn: 213744
Summary:
This pragma is very rare. We could *hypothetically* lower some uses of
it down to @llvm.global_ctors, but given that GlobalOpt isn't able to
optimize prioritized global ctors today, there's really no point.
If we wanted to do this in the future, I would check if the section used
in the pragma started with ".CRT$XC" and had up to two characters after
it. Those two characters could form the 16-bit initialization priority
that we support in @llvm.global_ctors. We would have to teach LLVM to
lower prioritized global ctors on COFF as well.
This should let us compile some silly uses of this pragma in WebKit /
Blink.
Reviewers: rsmith, majnemer
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D4549
llvm-svn: 213593
In addition to enabling ELFv2 homogeneous aggregate handling,
LLVM support to pass array types directly also enables a performance
enhancement. We can now pass (non-homogeneous) aggregates that fit
fully in registers as direct integer arrays, using an element type
to encode the alignment requirement (that would otherwise go to the
"byval align" field).
This is preferable since "byval" forces the back-end to write the
aggregate out to the stack, even if it could be passed fully in
registers. This is particularly annoying on ELFv2, if there is
no parameter save area available, since we then need to allocate
space on the callee's stack just to hold those aggregates.
Note that to implement this optimization, this patch does not attempt
to fully anticipate register allocation rules as (defined in the
ABI and) implemented in the back-end. Instead, the patch is simply
passing *any* aggregate passed by value using the array mechanism
if its size is up to 64 bytes. This means that some of those will
end up being passed in stack slots anyway, but the generated code
shouldn't be any worse either. (*Large* aggregates remain passed
using "byval" to enable optimized copying via memcpy etc.)
llvm-svn: 213495
This patch implements clang support for the PowerPC ELFv2 ABI.
Together with a series of companion patches in LLVM, this makes
clang/LLVM fully usable on powerpc64le-linux.
Most of the ELFv2 ABI changes are fully implemented on the LLVM side.
On the clang side, we only need to implement some changes in how
aggregate types are passed by value. Specifically, we need to:
- pass (and return) "homogeneous" floating-point or vector aggregates in
FPRs and VRs (this is similar to the ARM homogeneous aggregate ABI)
- return aggregates of up to 16 bytes in one or two GPRs
The second piece is trivial to implement in any case. To implement
the first piece, this patch makes use of infrastructure recently
enabled in the LLVM PowerPC back-end to support passing array types
directly, where the array element type encodes properties needed to
handle homogeneous aggregates correctly.
Specifically, the array element type encodes:
- whether the parameter should be passed in FPRs, VRs, or just
GPRs/stack slots (for float / vector / integer element types,
respectively)
- what the alignment requirements of the parameter are when passed in
GPRs/stack slots (8 for float / 16 for vector / the element type
size for integer element types) -- this corresponds to the
"byval align" field
With this support in place, the clang part simply needs to *detect*
whether an aggregate type implements a float / vector homogeneous
aggregate as defined by the ELFv2 ABI, and if so, pass/return it
as array type using the appropriate float / vector element type.
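For example (a sketch): under ELFv2 this is a homogeneous floating-point
aggregate, so it is now passed and returned in FPRs, as an array of its element
type at the IR level, rather than going through memory.
  struct vec3 {
    double x, y, z;
  };
  struct vec3 scale(struct vec3 v, double s) {
    v.x *= s; v.y *= s; v.z *= s;
    return v;
  }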
llvm-svn: 213494
In C99, an array parameter declarator might have the form:
direct-declarator '[' 'static' type-qual-list[opt] assign-expr ']'
where the static keyword indicates that the caller will always provide a
pointer to the beginning of an array with at least the number of elements
specified by the assignment expression. For constant sizes, we can use the
new dereferenceable attribute to pass this information to the optimizer. For
VLAs, we don't know the size, but (for addrspace(0)) do know that the pointer
must be nonnull (and so we can use the nonnull attribute).
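For example (a sketch): 'static 8' promises at least 8 accessible elements, so
for this constant size the pointer can be marked dereferenceable; with a VLA
bound it would only be marked nonnull.
  int sum8(int a[static 8]) {
    int s = 0;
    for (int i = 0; i < 8; ++i)
      s += a[i];
    return s;
  }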
llvm-svn: 213444
Thoroughly check for a pointer dereference which yields a glvalue. Look
through casts, comma operators, conditional operators, paren
expressions, etc.
This was originally D4416.
Differential Revision: http://reviews.llvm.org/D4592
llvm-svn: 213434
This reverts commit r213401, r213402, r213403, and r213404.
I accidently committed these changes instead of updating the
differential.
llvm-svn: 213405
Summary:
Thoroughly check for a pointer dereference which yields a glvalue. Look
through casts, comma operators, conditional operators, paren
expressions, etc.
Reviewers: rsmith
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D4416
llvm-svn: 213401
Clang uses a diagnostic handler to grab diagnostic messages so it can print them
with the line of source code they refer to. This patch extends this to handle
optimization failures that were added to llvm to produce a warning when
loop vectorization is explicitly specified (using a pragma clang loop directive)
but fails.
This update renames the warning flag to avoid indicating the flag's severity
and adds a test.
Reviewed by Alp Toker
llvm-svn: 213400
Otherwise -fsanitize=vptr causes the program to crash when it downcasts
a null pointer.
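A sketch of the failure mode (hypothetical types): downcasting a null Base*
used to make the instrumented code itself crash.
  struct Base { virtual ~Base() {} };
  struct Derived : Base { int extra; };
  int probe(Base *b) {                       // called with b == nullptr
    Derived *d = static_cast<Derived *>(b);  // -fsanitize=vptr instruments this
    return d ? d->extra : 0;
  }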
Reviewed in http://reviews.llvm.org/D4412.
Patch by Byoungyoung Lee!
llvm-svn: 213393
Summary:
This change adds descriptions of the globals created by UBSan
instrumentation (UBSan handlers, type descriptors, filenames) to
llvm.asan.globals metadata, effectively "blacklisting" them. This can
dramatically decrease the data section in binaries built with UBSan+ASan,
as UBSan tends to create a lot of handlers, and ASan instrumentation
increases the global size to at least 64 bytes.
Test Plan: clang regression test suite
Reviewers: rsmith
Reviewed By: rsmith
Subscribers: cfe-commits, byoungyoung, kcc
Differential Revision: http://reviews.llvm.org/D4575
llvm-svn: 213392
Because references must be initialized using some evaluated expression, they
must point to something, and a callee can assume the reference parameter is
dereferenceable. Taking advantage of a new attribute just added to LLVM, mark
them as such.
Because dereferenceability in addrspace(0) implies nonnull in the backend, we
don't need both attributes. However, we need to know the size of the object to
use the dereferenceable attribute, so for incomplete types we still emit only
nonnull.
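For instance (a sketch), the reference parameter here is now marked
dereferenceable(sizeof(int)) since int is a complete type; for an incomplete
type only nonnull would be emitted.
  int read_it(const int &r) {
    return r;
  }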
llvm-svn: 213386
r211898 introduced a regression where a large struct, which would
normally be passed ByVal, was causing padding to be inserted to
prevent the backend from using some GPRs, in order to follow the
AAPCS. However, the type of the argument was not being set correctly,
so the backend could not align 8-byte aligned struct types on the stack.
The fix is to not insert the padding arguments when the argument is
being passed ByVal.
llvm-svn: 213359
This reverts commit r213307.
Reverting to have some on-list discussion/confirmation about the ongoing
direction of smart pointer usage in the LLVM project.
llvm-svn: 213325
(after fixing a bug in MultiplexConsumer I noticed the ownership of the
nested consumers was implemented with raw pointers - so this fixes
that... and follows the source back to its origin pushing unique_ptr
ownership up through there too)
llvm-svn: 213307
This makes us emit dllexported in-class initialized static data members (which
are treated as definitions in MSVC), even when they're not referenced.
It also makes their special linkage reflected in the GVA linkage instead of
getting massaged in CodeGen.
Differential Revision: http://reviews.llvm.org/D4563
llvm-svn: 213304
This is used to mark the instructions emitted by Clang to implement a
variety of UBSan checks. Generally, we don't want to instrument these
instructions with other sanitizers (like ASan).
Reviewed in http://reviews.llvm.org/D4544
llvm-svn: 213291
-- a constructor list initialization that unpacked an initializer list into
constructor arguments and
-- a list initialization that created a std::initializer_list and passed it
as the first argument to a constructor
in the AST. Use this flag while instantiating templates to provide the right
semantics for the resulting initialization.
llvm-svn: 213224
Clang supports __assume, at least at the semantic level, when MS extensions are
enabled. Unfortunately, trying to actually compile code using __assume would
result in this error:
error: cannot compile this builtin function yet
__assume is an optimizer hint, and can be ignored at the IR level. Until LLVM
supports assumptions at the IR level, a noop lowering is valid, and that is
what is done here.
llvm-svn: 213206
to be applied to class or protocols. This will direct IRGen
for Objective-C metadata to use the new name in various places
where class and protocol names are needed.
rdar://17631257
llvm-svn: 213167
Clang uses a diagnostic handler to grab diagnostic messages so it can print them
with the line of source code they refer to. This patch extends this to handle
diagnostic warnings that were added to llvm to produce a warning when
loop vectorization is explicitly specified (using a pragma clang loop directive)
but fails.
Reviewed by: Aaron Ballman
llvm-svn: 213112
This patch implements __builtin_arm_nop intrinsic for AArch32 and AArch64,
which generates hint 0x0, the alias of the NOP instruction.
This intrinsic is necessary to implement ACLE __nop intrinsic.
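Minimal usage sketch:
  void relax(void) {
    __builtin_arm_nop();   /* emits hint 0x0, the NOP alias */
  }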
Differential Revision: http://reviews.llvm.org/D4495
llvm-svn: 212947
Previously, we would have a private backing variable and an internal
alias pointing at it.
However, -fdata-sections only fires if a global variable has non-private
linkage. This means that an unreferenced vftable wouldn't get
discarded, bloating the object file.
Instead, stick the backing variable in a comdat even if the alias has
internal linkage. This will allow the linker to drop the vftable if it
is unused.
llvm-svn: 212901
Currently the ASan instrumentation pass creates a string with the global name
for each instrumented global (to include global names in the error report). The
global name is already mangled at this point, and we may not be able to demangle it
at runtime (e.g. there is no __cxa_demangle on Android).
Instead, create a string with fully qualified global name in Clang, and pass it
to ASan instrumentation pass in llvm.asan.globals metadata. If there is no metadata
for some global, ASan will use the original algorithm.
This fixes https://code.google.com/p/address-sanitizer/issues/detail?id=264.
llvm-svn: 212872
OS X TLS has all accesses going through the thread-wrapper function and
gives the backing thread-local variable internal linkage. This means
that thread-wrappers must have WeakAnyLinkage so that references to the
internal thread-local variables do not get propagated to other code.
It also means that translation units which do not provide a definition
for the thread-local variable cannot attempt to emit a thread-wrapper
because the thread wrapper will attempt to reference the backing
variable.
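A sketch of the situation described (hypothetical variable): this translation
unit only declares the thread_local, so on Darwin it must call the weak
thread-wrapper emitted by the defining translation unit rather than emit its
own, since the backing variable there has internal linkage.
  extern thread_local int tls_counter;
  int bump() {
    return ++tls_counter;   // access goes through the thread-wrapper function
  }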
Differential Revision: http://reviews.llvm.org/D4109
llvm-svn: 212841