Unlike the standard AAPCS64 ABI, variadic arguments are always passed on the
stack with the Darwin ABI, and this was not being considered when deciding
whether to expand HFA/HVA arguments in a call. An HFA argument with a "float"
base type was being expanded into separate "float" arguments, each of which
was then extended to a double, resulting in a serious mismatch from what is
expected by the va_arg implementation. <rdar://problem/15777067>
llvm-svn: 206729
The frontend option -fno-optimize-sibling-calls resolves to -cc1's
-mdisable-tail-calls, which is passed to the TargetMachine in the
backend. PassManagerBuilder was adding the -tailcallelim pass anyway.
Use a new DisableTailCalls option in PassManagerBuilder to disable tail
calls harder.
Requires the matching commit in LLVM that adds DisableTailCalls.
<rdar://problem/16050591>
llvm-svn: 206543
My first attempt to make sure HFAs were contiguous was in the block dealing
with padding registers, which meant it only triggered on the first stack-based
HFA. This should extend it to the rest as well.
Another part of PR19432.
llvm-svn: 206456
This is a partial revert of 183015.
By not recognizing things like _setjmp we lose (returns_twice) attribute on
them, which leads to incorrect code generation.
Fixes PR16138.
llvm-svn: 206362
This implements clause C.8 of the AAPCS in the front-end, so that Clang
accurately knows when the registers run out and it has to insert padding before
the stack objects begin.
PR19432.
llvm-svn: 206296
This patch adds support for the msvc pragmas section, bss_seg, code_seg,
const_seg and data_seg as well as support for __declspec(allocate()).
Additionally it corrects semantics and adds diagnostics for
__attribute__((section())) and the interaction between the attribute
and the msvc pragmas and declspec. In general conflicts should now be
well diganosed within and among these features.
In supporting the pragmas new machinery for uniform lexing for
msvc pragmas was introduced. The new machinery always lexes the
entire pragma and stores it on an annotation token. The parser
is responsible for parsing the pragma when the handling the
annotation token.
There is a known outstanding bug in this implementation in C mode.
Because these attributes and pragmas apply _only_ to definitions, we
process them at the time we detect a definition. Due to tentative
definitions in C, we end up processing the definition late. This means
that in C mode, everything that ends up in a BSS section will end up in
the _last_ BSS section rather than the one that was live at the time of
tentative definition, even if that turns out to be the point of actual
definition. This issue is not known to impact anything as of yet
because we are not aware of a clear use or use case for #pragma bss_seg
but should be fixed at some point.
Differential Revision=http://reviews.llvm.org/D3065#inline-16241
llvm-svn: 205810
Summary:
MSVC always emits inline functions marked with the extern storage class
specifier. The result is something similar to the opposite of
__attribute__((gnu_inline)).
This extension is also available in C.
This fixes PR19264.
Reviewers: rnk, rsmith
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D3207
llvm-svn: 205485
This adds support for the various NEON intrinsics used by
aarch64-neon-intrinsics.c (originally written for AArch64) and enables the
test.
My implementations are designed to be semantically correct, the actual code
quality looks like its a wash between the two backends, and is frequently
different (hence the large number of CHECK changes).
llvm-svn: 205210
Really, all tests outside of the Driver tree should use %clang_cc1, but
these are new and easy to fix, and many of them use buitlin headers
which don't work as well without using %clang_cc1.
llvm-svn: 205147
At least on REL6 (Linux/glibc 2.12), the proper symbol for generating gprof
data is _mcount, not mcount. Prior to this change, compiling with -pg would
generate linking errors (because of unresolved references to mcount), after
this change -pg seems at least minimally functional.
llvm-svn: 205144
This adds Clang support for the ARM64 backend. There are definitely
still some rough edges, so please bring up any issues you see with
this patch.
As with the LLVM commit though, we think it'll be more useful for
merging with AArch64 from within the tree.
llvm-svn: 205100
The peculiarities of C99 create scenario where an LLVM IR function
declaration may need to be replaced with a definition baring a different
type because the prototype and definition are not required to agree.
However, we were not properly deferring this when it occurred.
This fixes PR19280.
llvm-svn: 205099
This produces valid IR now that llvm rejects aliases to weak aliases and warns
the user that the resolution is not changed if the weak alias is overridden.
llvm-svn: 204935
When parsing MS inline assembly, we note that fpsw is an implicit def of
most x87 FP operations, and add it to the clobber list. However, we
don't recognize fpsw as a gcc register name, and we assert. Clang
always adds an fpsr clobber, which means the same thing to LLVM, so we
can just use that.
This test case was broken by my LLVM change r196939.
Reviewers: echristo
Differential Revision: http://llvm-reviews.chandlerc.com/D2993
llvm-svn: 204878
This commit fixes a cast instruction assertion failure
due to the incompatible type cast. This will only happen when
the target requires atomic libcalls.
llvm-svn: 204834
COFF doesn't have mergeable sections so LLVM/clang's normal tactics for
string deduplication will not have any effect.
To remedy this we place each string inside it's own section and mark
the section as IMAGE_COMDAT_SELECT_ANY. However, we can only do this if the
string has an external name that we can generate from it's contents.
To be compatible with MSVC, we must use their scheme. Otherwise identical
strings in translation units from clang may not be deduplicated with
translation units in MSVC.
This fixes PR18248.
N.B. We will not attempt to do anything with a string literal which is not of
type 'char' or 'wchar_t' because their compiler does not support unicode
string literals as of this date. Further, we avoid doing this if
either -fwritable-strings or -fsanitize=address are present.
This reverts commit r204596.
llvm-svn: 204675
Use two check-prefix patterns per FileCheck invocation for these tests,
this cleanly removes redundant CHECK directives.
Thanks to Richard Smith for the idea!
llvm-svn: 204587
COFF doesn't have mergeable sections so LLVM/clang's normal tactics for
string deduplication will not have any effect.
To remedy this we place each string inside it's own section and mark
the section as IMAGE_COMDAT_SELECT_ANY. However, we can only do this if the
string has an external name that we can generate from it's contents.
To be compatible with MSVC, we must use their scheme. Otherwise identical
strings in translation units from clang may not be deduplicated with
translation units in MSVC.
This fixes PR18248.
N.B. We will not attempt to do anything with a string literal which is not of
type 'char' or 'wchar_t' because their compiler does not support unicode
string literals as of this date.
llvm-svn: 204562
This makes Clang take advantage of the recent IR addition of a
"failure" memory ordering requirement. As with the "success" ordering,
we try to emit just a single version if the expression is constant,
but fall back to runtime detection (to allow optimisation across
function-call boundaries).
rdar://problem/15996804
llvm-svn: 203837
When a struct has bitfields overlapping with other members
(as required by the AAPCS), clang uses a packed struct to
represent this. If such a struct is large enough for clang to
pass it as a byval pointer (>64 bytes), we need to set the
alignment of the argument to match the original type.
llvm-svn: 203660
This is a conservative check, because it's valid for the expression to be
non-constant, and in cases like that we just don't know whether it's valid.
rdar://problem/16242991
llvm-svn: 203561
These tests are logically related, but they're spread about several
different CodeGen directories. Consolidate them in one place to make
them easier to manage.
llvm-svn: 203541
Summary:
'Expected' should only be modified if the operation fails.
This fixes PR18899.
Reviewers: chandlerc, rsmith, rjmccall
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D2922
llvm-svn: 203493
These tests were added before we had settled on using a .profdata extension
for the profile data files. Renaming them now for consistency.
llvm-svn: 203166
In addition, for all functions, use the name from the llvm::Function to
identify the function in the profile data. Compute that "function name",
including the file name for local functions, once when assigning the PGO
counters and store it in the CodeGenPGO class.
Move the code to add InlineHint and Cold attributes out of StartFunction(),
because the "function name" string isn't available at that point.
llvm-svn: 203075
This adds support for the PPC "wc" inline asm constraint (used for allocating
individual CR bits). Support for this constraint type was recently added to the
LLVM PowerPC backend. Although gcc does not currently support allocating
individual CR bits, this identifier choice has been coordinated with the gcc
PowerPC team, and will be marked as reserved for this purpose in the gcc
constraints.md file.
Prior to this change, none of the multi-character PPC constraints were handled
correctly (the '^' escape character was not being added as required by the
parsing code in LLVM). This should now be fixed. I'll add tests for these other
constraints as support is added for them in the backend.
llvm-svn: 202658
When lowering a bitfield, CGRecordLowering would assign the wrong
storage type to a bitfield in some cases and trigger an assertion. In
these cases the layout was still correct, just the bitfield info was
wrong.
llvm-svn: 202562
CGRecordLayoutBuilder was aging, complex, multi-pass, and shows signs of
existing before ASTRecordLayoutBuilder. It redundantly performed many
layout operations that are now performed by ASTRecordLayoutBuilder and
asserted that the results were the same. With the addition of support
for the MS-ABI, such as placement of vbptrs, vtordisps, different
bitfield layout and a variety of other features, CGRecordLayoutBuilder
was growing unwieldy in its redundancy.
This patch re-architects CGRecordLayoutBuilder to not perform any
redundant layout but rather, as directly as possible, lower an
ASTRecordLayout to an llvm::type. The new architecture is significantly
smaller and simpler than the CGRecordLayoutBuilder and contains fewer
ABI-specific code paths. It's also one pass.
The architecture of the new system is described in the comments. For the
most part, the new system simply takes all of the fields and bases from
an ASTRecordLayout, sorts them, inserts padding and dumps a record.
Bitfields, unions and primary virtual bases make this process a bit more
complicated. See the inline comments.
In addition, this patch updates a few lit tests due to the fact that the
new system computes more accurate llvm types than CGRecordLayoutBuilder.
Each change is commented individually in the review.
Differential Revision: http://llvm-reviews.chandlerc.com/D2795
llvm-svn: 201907
This breaks backwards compatibility with existing code. Previously, this
was defined as
#define _mm_prefetch(a, sel) (__builtin_prefetch((void *)(a), 0, (sel)))
Which basically accepts any pointer. Changing this to char* simply
breaks a lot of existing code. I have tried changing char* to
"const void*", which seems to be the right thing as per Intel
specification this should work on basically any pointer. However,
apparently this breaks windows compatibility (because of a conflicting
declaration in windows.h).
So, we probably need to #ifdef this based on whether clang is compiling
for windows. According to Chandler, this might be done by introducing an
additional symbol to a fake type in BuiltinsX86.def and then condition
the type expansion on the platform.
llvm-svn: 201775
This patch adds several built-ins that are required for ms
compatibility. _mm_prefetch must be a built-in because it takes a
compile-time constant argument and our prior approach of using a #define
to the current built-in doesn't work in the presence of re-declaration
of _mm_prefetch. The others can be obtained by including the windows
system headers. If a user includes the windows system headers but not
intrin.h they still need to work and therefore must be built-in because
we don't get a chance to implement them in intrin.h in this case.
llvm-svn: 201734
This fixes one immediate bug where an expression with side-effects
could be emitted twice during a NEON call.
It also prepares the way for folding CodeGen for many of the SISD
intrinsics into a table, reducing code size and hopefully increasing
performance eventually ("binary search + few switch cases" should be
better than "lots of switch cases").
llvm-svn: 201667
These instructions (well, the f32 ones) are supported on 32-bit ARMv8, not just
AArch64. Now that the arm_neon.td refactoring is complete, adding them is
surprisingly simple.
rdar://problem/16035743
llvm-svn: 201661
Previously, we made one traversal of the AST prior to codegen to assign
counters to the ASTs and then propagated the count values during codegen. This
patch now adds a separate AST traversal prior to codegen for the
-fprofile-instr-use option to propagate the count values. The counts are then
saved in a map from which they can be retrieved during codegen.
This new approach has several advantages:
1. It gets rid of a lot of extra PGO-related code that had previously been
added to codegen.
2. It fixes a serious bug. My original implementation (which was mailed to the
list but never committed) used 3 counters for every loop. Justin improved it to
move 2 of those counters into the less-frequently executed breaks and continues,
but that turned out to produce wrong count values in some cases. The solution
requires visiting a loop body before the condition so that the count for the
condition properly includes the break and continue counts. Changing codegen to
visit a loop body first would be a fairly invasive change, but with a separate
AST traversal, it is easy to control the order of traversal. I've added a
testcase (provided by Justin) to make sure this works correctly.
3. It improves the instrumentation overhead, reducing the number of counters for
a loop from 3 to 1. We no longer need dedicated counters for breaks and
continues, since we can just use the propagated count values when visiting
breaks and continues.
To make this work, I needed to make a change to the way we count case
statements, going back to my original approach of not including the fall-through
in the counter values. This was necessary because there isn't always an AST node
that can be used to record the fall-through count. Now case statements are
handled the same as default statements, with the fall-through paths branching
over the counter increments. While I was at it, I also went back to using this
approach for do-loops -- omitting the fall-through count into the loop body
simplifies some of the calculations and make them behave the same as other
loops. Whenever we start using this instrumentation for coverage, we'll need
to add the fall-through counts into the counter values.
llvm-svn: 201528
When a function has a single counter, we will offset the pointer by 1 when
parsing the next function. If a function has multiple counters, we are
okay after skipping rest of the counters.
llvm-svn: 201456
Summary:
AsmPrinter::EmitInlineAsm() will no longer use the EmitRawText() call for
targets with mature MC support. Such targets will always parse the inline
assembly (even when emitting assembly). Targets without mature MC support
continue to use EmitRawText() for assembly output.
The hasRawTextSupport() check in AsmPrinter::EmitInlineAsm() has been replaced
with MCAsmInfo::UseIntegratedAs which when true, causes the integrated assembler
to parse inline assembly (even when emitting assembly output). UseIntegratedAs
is set to true for targets that consider any failure to parse valid assembly
to be a bug. Target specific subclasses generally enable the integrated
assembler in their constructor. The default value can be overridden with
-no-integrated-as.
All tests that rely on inline assembly supporting invalid assembly (for example,
those that use mnemonics such as 'foo' or 'hello world') have been updated to
disable the integrated assembler.
Changes since review (and last commit attempt):
- Fixed test failures that were missed due to configuration of local build.
(fixes crash.ll and a couple others).
- Fixed tests that happened to pass because the local build was on X86
(should fix 2007-12-17-InvokeAsm.ll)
- mature-mc-support.ll's should no longer require all targets to be compiled.
(should fix ARM and PPC buildbots)
- Object output (-filetype=obj and similar) now forces the integrated assembler
to be enabled regardless of default setting or -no-integrated-as.
(should fix SystemZ buildbots)
Reviewers: rafael
Reviewed By: rafael
CC: llvm-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D2686
llvm-svn: 201333
useBitFieldTypeAlignment() and appears to ignore the special
bit-packing semantics of __attribute__((packed)).
Further flesh out an already-extensive comment.
llvm-svn: 201282
This test case doesn't belong in Clang (it's testing IndVarSimplify) but
in an effort to reproduce the test case this was intended to cover (by
essentially reverting r134441) I wasn't able to reproduce the failure
this test case should've produced. So I haven't ported this down to
LLVM, instead I'm just deleting it.
I suspect the test is just underconstrained, but I've no great interest
in trying hard to fix it right now - if anyone else wants to, I'd be
more than welcome to that.
llvm-svn: 201178
Xcore target ABI requires const data that is externally visible
to be handled differently if it has C-language linkage rather than
C++ language linkage.
llvm-svn: 201142
According to the AAPCS, we can split structs between GPRs and the stack,
except for when an argument has already been allocated on the stack. This
can occur when a large number of floating-point arguments fill up the VFP
registers, and are alllocated on the stack before the general-purpose argument
registers are full.
llvm-svn: 201137
This option has the following effects:
* It adds the sspstrong IR attribute to each function within the CU.
* It defines the macro __SSP_STRONG__ with the value of 2.
Differential Revision: http://llvm-reviews.chandlerc.com/D2717
llvm-svn: 201120
An HFA is defined as a struct containing floating point values of the
same machine type. In the 32-bit ABI, double and long double have the
same machine type, so a struct with a mixture of these types must be an
HFA (assuming it meets the other criteria).
llvm-svn: 200971
We collect a maximal function count among all functions in the pgo data file.
For functions that are hot, we set its InlineHint attribute. For functions that
are cold, we set its Cold attribute.
We currently treat functions with >= 30% of the maximal function count as hot
and functions with <= 1% of the maximal function count are treated as cold.
These two numbers are from preliminary tuning on SPEC.
This commit should not affect non-PGO builds and should boost performance on
instrumentation based PGO.
llvm-svn: 200874
Arguments and return values must always be marshalled as for the base
AAPCS when the callee is a variadic function.
Patch by Oliver Stannard!
llvm-svn: 200307
Due to statement expressions supported as GCC extension, it is possible
to put 'break' or 'continue' into a loop/switch statement but outside
its body, for example:
for ( ; ({ if (first) { first = 0; continue; } 0; }); )
This code is rejected by GCC if compiled in C mode but is accepted in C++
code. GCC bug 44715 tracks this discrepancy. Clang used code generation
that differs from GCC in both modes: only statement of the third
expression of 'for' behaves as if it was inside loop body.
This change makes code generation more close to GCC, considering 'break'
or 'continue' statement in condition and increment expressions of a
loop as it was inside the loop body. It also adds error for the cases
when 'break'/'continue' appear outside loop due to this syntax. If
code generation differ from GCC, warning is issued.
Differential Revision: http://llvm-reviews.chandlerc.com/D2518
llvm-svn: 199897
PNaCl and Emscripten can both handle va_arg IR instructions with
struct type.
Also add a test to cover generating a va_arg IR instruction from
va_arg in C on le32 (as already handled by VisitVAArgExpr() in
CGExprScalar.cpp), which was not covered by a test before.
(This fixes https://code.google.com/p/nativeclient/issues/detail?id=2381)
Differential Revision: http://llvm-reviews.chandlerc.com/D2539
llvm-svn: 199830
of the current compilation unit.
As a side effect this enables many more LTO uniquing opportunities.
This reapplies r199757 with a better testcase.
llvm-svn: 199760
Without them they can be merged with non unnamed_addr constants during LTO.
The resulting constant is not unnamed_addr and goes in a different section,
which causes ld64 to crash.
A testcase that would crash before:
* file1.mm:
void g(id notification) {
[notification valueForKey:@"name"];
}
* file2.cpp:
extern const char js_name_str[] = "name";
* file3.cpp
extern bool JS_GetProperty(const char *name);
extern const char js_name_str[];
bool js_ReportUncaughtException() { JS_GetProperty(js_name_str); }
run
clang file1.mm -o file1.o -c -w -emit-llvm
clang file2.cpp -o file2.o -c -w -emit-llvm
clang file3.cpp -o file3.o -c -w
ld -dylib -o XUL file1.o file2.o file3.o -undefined dynamic_lookup.
llvm-svn: 199688
marked as AlwaysInline or ForceInline.
This moves us to what gcc does with -fno-inline. The attribute approach
was discussed to be better than switching to InlineAlways inliner in presence
of LTO.
llvm-svn: 199324
a subprocess invocation which is pretty significant on Windows. It also
likely saves a bunch of thrashing the host machine needlessly. Finally
it makes the tests much more predictable and less dependent on the host.
For example 'header_lookup1.c' was passing '-fno-ms-extensions' just to
thwart the host detection adding it into the compilation. By runnig CC1
directly we don't have to deal with such oddities.
llvm-svn: 199308
This makes the C++ ABI depend entirely on the target: MS ABI for -win32 triples,
Itanium otherwise. It's no longer possible to do weird combinations.
To be able to run a test with a specific ABI without constraining it to a
specific triple, new substitutions are added to lit: %itanium_abi_triple and
%ms_abi_triple can be used to get the current target triple adjusted to the
desired ABI. For example, if the test suite is running with the i686-pc-win32
target, %itanium_abi_triple will expand to i686-pc-mingw32.
Differential Revision: http://llvm-reviews.chandlerc.com/D2545
llvm-svn: 199250
These functions have the same constness properties of the normal libm
functions, which allows LLVM to optimise code better in general. There
are also a couple of specific optimisations that only trigger when
these are properly marked.
rdar://problem/13729466
llvm-svn: 199249
With the old linkage types removed, set the linkage to external for both
dllimport and dllexport to reflect what's currently supported.
llvm-svn: 199220
In preparation for making the Win32 triple imply MS ABI mode,
make all tests pass in this mode, or make them use the Itanium
mode explicitly.
Differential Revision: http://llvm-reviews.chandlerc.com/D2401
llvm-svn: 199130
Right now clang produces the same DataLayout for all of them, but it could, for
example, add 'n' specifications when the end architecture is given.
No functionality change, this should just make future changes easier to read.
llvm-svn: 197549
This has no functionality change as clang adds explicit alignment info for
byval arguments. The only difference is that now the clang produced
DataLayout string for AArch64 is identical to the LLVM produced one.
llvm-svn: 197538
This completes the cleanup/refactoring of DataLayout on the clang side. Next
is figuring out the differences between the llvm and clang produced strings
llvm-svn: 197442