pointer, but such folding encounters side-effects, ignore the side-effects
rather than performing them at runtime: CodeGen generates wrong code for
__builtin_object_size in that case.
llvm-svn: 157310
Currently cold functions are marked with the "optsize" attribute in CodeGen
so they are always optimized for size. The hot attribute is just ignored,
LLVM doesn't have a way to express hotness at the moment.
llvm-svn: 156723
A vector should be returned via the hidden pointer argument except if its size
is equal to or smaller than 16-bytes and the target ABI is N32 or N64.
llvm-svn: 156642
It reduces the amount of emitted debug information:
1) DIEs in .debug_info have types DW_TAG_compile_unit, DW_TAG_subprogram,
DW_TAG_inlined_subroutine (for opt builds) and DW_TAG_lexical_block only.
2) .debug_str contains only function names.
3) No debug data for types/namespaces/variables is emitted.
4) The data in .debug_line is enough to produce valid stack traces with
function names and line numbers.
Reviewed by Eric Christopher.
llvm-svn: 156160
goodness because it provides opportunites to cleanup things. For example,
uint64_t t1(__m128i vA)
{
uint64_t Alo;
_mm_storel_epi64((__m128i*)&Alo, vA);
return Alo;
}
was generating
movq %xmm0, -8(%rbp)
movq -8(%rbp), %rax
and now generates
movd %xmm0, %rax
rdar://11282581
llvm-svn: 155924
i32 __builtin_annotation(i32, string);
Applying it to i64 (e.g., long long) generates the following IR.
trunc i64 {{.*}} to i32
call i32 @llvm.annotation.i32
zext i32 {{.*}} to i64
The redundant truncation and extension make the result difficult to use.
This patch makes __builtin_annotation() generic.
type __builtin_annotation(type, string);
For the i64 example, it simplifies the generated IR to:
call i64 @llvm.annotation.i64
Patch by Xi Wang!
llvm-svn: 155764
more aligned than the block header but also contains something
smaller than the block-header alignment but not exactly half
the difference between the large alignment and the header
alignment. Got that?
I'm really not sure what I was thinking with the buggy computation
here, but the fix is pretty obvious.
llvm-svn: 155662
With -fno-math-errno (the default for Darwin) or -ffast-math these library
function can be marked readnone enabling more opportunities for CSE and other
optimizations.
rdar://11251464
llvm-svn: 155498
__atomic_test_and_set, __atomic_clear, plus a pile of undocumented __GCC_*
predefined macros.
Implement library fallback for __atomic_is_lock_free and
__c11_atomic_is_lock_free, and implement __atomic_always_lock_free.
Contrary to their documentation, GCC's __atomic_fetch_add family don't
multiply the operand by sizeof(T) when operating on a pointer type.
libstdc++ relies on this quirk. Remove this handling for all but the
__c11_atomic_fetch_add and __c11_atomic_fetch_sub builtins.
Contrary to their documentation, __atomic_test_and_set and __atomic_clear
take a first argument of type 'volatile void *', not 'void *' or 'bool *',
and __atomic_is_lock_free and __atomic_always_lock_free have an argument
of type 'const volatile void *', not 'void *'.
With this change, libstdc++4.7's <atomic> passes libc++'s atomic test suite,
except for a couple of libstdc++ bugs and some cases where libc++'s test
suite tests for properties which implementations have latitude to vary.
llvm-svn: 154640
This is not quite sufficient for libstdc++'s <atomic>: we still need
__atomic_test_and_set and __atomic_clear, and may need a more complete
__atomic_is_lock_free implementation.
We are also missing an implementation of __atomic_always_lock_free,
__atomic_nand_fetch, and __atomic_fetch_nand, but those aren't needed
for libstdc++.
llvm-svn: 154579
<stdatomic.h> header.
In passing, fix LanguageExtensions to note that C11 and C++11 are no longer
"upcoming standards" but are now actually standardized.
llvm-svn: 154513
and emit a relatively empty block for a plain break statement. This
enables us to track where we went through a switch.
PR9796 & rdar://11215207
llvm-svn: 154420
section. A 'normal' string will go into the __TEXT,__const section, but this
isn't good for UTF16 strings. The __ustring section allows for coalescing, among
other niceties (such as allowing the linker to easily split up strings).
Instead of outputting the UTF16 string as a series of bytes, output it as a
series of shorts. The back-end will then nicely place the UTF16 string into the
correct section, because it's a mensch.
<rdar://problem/10655949>
llvm-svn: 153710
LLVM intrinsics for.
I have an implementation of these functions, which wants to go in a libgcc_s
equivalent in compiler-rt. It's currently here:
http://people.freebsd.org/~theraven/atomic.c
It will be committed to compiler-rt as soon as I work out where would be a
sensible place to put it...
llvm-svn: 153666
For i686 targets (eg. cygwin), I saw "Range must not be empty!" in verifier.
It produces (i32)[0x80000000:0x80000000) from (uint64_t)[0xFFFFFFFF80000000ULL:0x0000000080000000ULL), for signed i32 on MDNode::Range.
llvm-svn: 153382
-fno-inline-functions.
This behaves much like -fno-inline in gcc, but based on a discussion with
Daniel it was decided that -fno-inline-functions should subsume -fno-inline.
Please speak up if you object. The -fno-inline flag remains ignored.
Final part of rdar://10972766
llvm-svn: 152754
- We do this when it is easy to determine that the backend will pass them on
the stack properly by itself.
Currently LLVM codegen is really bad in some cases with byval, for example, on
the test case here (which is derived from Sema code, which likes to pass
SourceLocations around)::
struct s47 { unsigned a; };
void f47(int,int,int,int,int,int,struct s47);
void test47(int a, struct s47 b) { f47(a, a, a, a, a, a, b); }
we used to emit code like this::
...
movl %esi, -8(%rbp)
movl -8(%rbp), %ecx
movl %ecx, (%rsp)
...
to handle moving the struct onto the stack, which is just appalling.
Now we generate::
movl %esi, (%rsp)
which seems better, no?
llvm-svn: 152462
by the BAA pass, which uses the default TargetLibraryInfo constructor.
Unfortunately, the default TargetLibraryInfo constructor assumes all library
calls are available and thus ignores -fno-builtin.
rdar://10947759
llvm-svn: 151745
The bug that was caught by Apple's internal buildbots was valid and also showed another bug in my implementation.
These are now fixed, with regression tests added to catch them both (not Darwin-specific).
Original log:
====================
Revert r151638 because it causes assertion hit on PCH creation for Cocoa.h
Original log:
---------------------
Correctly track tags and enum members defined in the prototype of a function, and ensure they are properly scoped.
This fixes code such as:
enum e {x, y};
int f(enum {y, x} n) {
return 0;
}
This finally fixes PR5464 and PR5477.
---------------------
I also reverted r151641 which was enhancement on top of r151638.
====================
llvm-svn: 151712
Original log:
---------------------
Correctly track tags and enum members defined in the prototype of a function, and ensure they are properly scoped.
This fixes code such as:
enum e {x, y};
int f(enum {y, x} n) {
return 0;
}
This finally fixes PR5464 and PR5477.
---------------------
I also reverted r151641 which was enhancement on top of r151638.
llvm-svn: 151667
match the behavior of GCC. Also add a test for these intrinsics, which
apparently have *zero* tests. =[ Not surprisingly, Clang crashed when
compiling these.
Fix the bug in CodeGen where we failed to bitcast the argument type to
x86mmx prior to calling the LLVM intrinsic. This fixes an assert on the
new 3dnow-builtins.c test.
This is one issue impacting the efforts to get Clang to emulate the
Microsoft intrinsics headers -- 3dnow intrinsics are implictitly made
available there.
llvm-svn: 150948
This changes function prolog in such a way as to avoid out-of-bounds
stack store in the case when coerce-to type has a larger storage size
than the real argument type.
Fixes PR11905.
llvm-svn: 150238
We had been generating load/store instructions with the default alignment
for the vector element type, even when the pointer argument had less alignment.
<rdar://problem/10538555>
llvm-svn: 149794
ARM supports clz and ctz directly and both operations have well-defined
results for zero. There is no disadvantage in performance to using the
defined-at-zero versions of llvm.ctlz/cttz intrinsics. We're running into
ARM-specific code written with the assumption that __builtin_clz(0) == 32,
even though that value is technically undefined. The code is failing now
because of llvm optimizations that are taking advantage of the undef
behavior (specifically svn r147255). There's nothing wrong with that
optimization on x86 where any incorrect assumptions about __builtin_clz(0)
will quickly be exposed. For ARM, though, optimizations based on that undef
behavior are likely to cause subtle bugs. Other targets with defined-at-zero
clz/ctz support may want to override the default behavior as well.
llvm-svn: 149086
end up in the same output file as the layout stuff. There may even be
a race condition which is causing this output to confuse the FileCheck
in some cases. I actually don't know how on earth the parsing of the
layout file even works given that there are diagnostics in the middle of
it. ;]
llvm-svn: 149058
provide the layout of records, rather than letting Clang compute
the layout itself. LLDB provides the motivation for this feature:
because various layout-altering attributes (packed, aligned, etc.)
don't get reliably get placed into DWARF, the record layouts computed
by LLDB from the reconstructed records differ from the actual layouts,
and badness occurs. This interface lets the DWARF data drive layout,
so we don't need the attributes preserved to get the answer write.
The testing methodology for this change is fun. I've introduced a
variant of -fdump-record-layouts called -fdump-record-layouts-simple
that always has the simple C format and provides size/alignment/field
offsets. There is also a -cc1 option -foverride-record-layout=<file>
to take the output of -fdump-record-layouts-simple and parse it to
produce a set of overridden layouts, which is introduced into the AST
via a testing-only ExternalASTSource (called
LayoutOverrideSource). Each test contains a number of records to lay
out, which use various layout-changing attributes, and then dumps the
layouts. We then run the test again, using the preprocessor to
eliminate the layout-changing attributes entirely (which would give us
different layouts for the records), but supplying the
previously-computed record layouts. Finally, we diff the layouts
produced from the two runs to be sure that they are identical.
Note that this code makes the assumption that we don't *have* to
provide the offsets of bases or virtual bases to get the layout right,
because the alignment attributes don't affect it. I believe this
assumption holds, but if it does not, we can extend
LayoutOverrideSource to also provide base offset information.
Fixes the Clang side of <rdar://problem/10169539>.
llvm-svn: 149055
address safety analysis (such as e.g. AddressSanitizer or SAFECode) for a specific function.
When building with AddressSanitizer, add AddressSafety function attribute to every generated function
except for those that have __attribute__((no_address_safety_analysis)).
With this patch we will be able to
1. disable AddressSanitizer for a particular function
2. disable AddressSanitizer-hostile optimizations (such as some cases of load widening) when AddressSanitizer is on.
llvm-svn: 148842
- Add atomic-to/from-nonatomic cast types
- Emit atomic operations for arithmetic on atomic types
- Emit non-atomic stores for initialisation of atomic types, but atomic stores and loads for every other store / load
- Add a __atomic_init() intrinsic which does a non-atomic store to an _Atomic() type. This is needed for the corresponding C11 stdatomic.h function.
- Enables the relevant __has_feature() checks. The feature isn't 100% complete yet, but it's done enough that we want people testing it.
Still to do:
- Make the arithmetic operations on atomic types (e.g. Atomic(int) foo = 1; foo++;) use the correct LLVM intrinsic if one exists, not a loop with a cmpxchg.
- Add a signal fence builtin
- Properly set the fenv state in atomic operations on floating point values
- Correctly handle things like _Atomic(_Complex double) which are too large for an atomic cmpxchg on some platforms (this requires working out what 'correctly' means in this context)
- Fix the many remaining corner cases
llvm-svn: 148242
APValue::Array and APValue::MemberPointer. All APValue values can now be emitted
as constants.
Add new CGCXXABI entry point for emitting an APValue MemberPointer. The other
entrypoints dealing with constant member pointers are no longer necessary and
will be removed in a later change.
Switch codegen from using EvaluateAsRValue/EvaluateAsLValue to
VarDecl::evaluateValue. This performs caching and deals with the nasty cases in
C++11 where a non-const object's initializer can refer indirectly to
previously-initialized fields within the same object.
Building the intermediate APValue object incurs a measurable performance hit on
pathological testcases with huge initializer lists, so we continue to build IR
directly from the Expr nodes for array and record types outside of C++11.
llvm-svn: 148178
CFStrings writable.
The strings (both Unicode and ASCII) should reside in a read-only section. E.g.,
__TEXT,__cstring instead of __DATA,__data. This is done by making the global
variable created for the strings constant despite the value of that flag.
<rdar://problem/10657500>
llvm-svn: 147845
is inserted before the real argument. Padding is needed to ensure the backend
reads from or writes to the correct argument slots when the original alignment
of a byval structure is unavailable due to flattening.
llvm-svn: 147699