LLVM currently has a hack (shouldEmitUsedDirectiveFor) that causes it to not
print no_dead_strip for symbols starting with 'l' or 'L'. These are exactly
the ones that clang's Objective-C codegen produces. The net result is that,
for these symbols, llvm.used is equivalent to llvm.compiler.used.
The need for putting the private symbols in llvm.compiler.used should be clear
(the Objective-C runtime uses them). The reason for also putting the weak
symbols in it is LTO: ld64 will not ask us to preserve them.
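As a rough sketch of what llvm.compiler.used amounts to at the IR level
(hand-built here purely for illustration; this is not code from this commit,
and it assumes no llvm.compiler.used already exists in the module):

  #include "llvm/IR/Constants.h"
  #include "llvm/IR/GlobalVariable.h"
  #include "llvm/IR/Module.h"

  // Keep GV alive for the compiler (but not for the linker) by
  // recording it in llvm.compiler.used.
  void addToCompilerUsed(llvm::Module &M, llvm::GlobalValue *GV) {
    llvm::Type *I8Ptr = llvm::Type::getInt8PtrTy(M.getContext());
    llvm::Constant *Cast = llvm::ConstantExpr::getBitCast(GV, I8Ptr);
    llvm::ArrayType *ATy = llvm::ArrayType::get(I8Ptr, 1);
    auto *Used = new llvm::GlobalVariable(
        M, ATy, /*isConstant=*/false, llvm::GlobalValue::AppendingLinkage,
        llvm::ConstantArray::get(ATy, Cast), "llvm.compiler.used");
    Used->setSection("llvm.metadata"); // required for llvm.compiler.used
  }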
llvm-svn: 203172
In addition, for all functions, use the name from the llvm::Function to
identify the function in the profile data. Compute that "function name",
including the file name for local functions, once when assigning the PGO
counters and store it in the CodeGenPGO class.
Move the code to add InlineHint and Cold attributes out of StartFunction(),
because the "function name" string isn't available at that point.
llvm-svn: 203075
For C++ functions, we will continue to use the mangled name to identify
functions in the PGO profile data, but that terminology is confusing for
things like Objective-C methods. For functions with local linkage, we're also
going to include the file name to help distinguish those functions, so this
changes the code to use more generic variable names.
No functional changes.
llvm-svn: 203074
Move the PGO.assignRegionCounters() call out of StartFunction, because that
function is called from many places where it does not make sense to do PGO
instrumentation (e.g., compiler-generated helper functions). Change several
functions to take a StringRef argument for the unique name associated with
a function, so that the name can be set differently for things like Objective-C
methods and block literals.
llvm-svn: 203073
I hit this while debugging another issue where my sources were in an
inconsistent state, so I don't have a testcase. Regardless, this check is
simpler and more direct than checking if the option is enabled.
llvm-svn: 203072
This patch changes the remaining GlobalVariables using "\01L" and
"\01l" prefixes to use private linkage. What is strange about them is
that they currently use WeakAnyLinkage. There is no comment stating
why and that is really odd since the symbols are completely hidden, so
it doesn't make sense for them to be weak.
Clang revisions like r63329, r63408, r63770, r65761 set the linkage to
weak, but don't say why. I suspect they were just copying llvm-gcc.
In llvm-gcc I found r58599 and r56322 that set DECL_WEAK, but they
were just syncing from the apple gcc. I am not exactly sure what that
means, since the last commit to
svn://gcc.gnu.org/svn/gcc/branches/apple was in 2006, 2 years earlier.
In summary, I have no idea why weak linkage was being used :-(
To quote John McCall, "Let’s try without it and see" :-)
llvm-svn: 203059
Summary:
The MSVC ABI appears to mangle the lexical scope into the names of
statics. Specifically, a counter is incremented whenever a scope is
entered where things can be declared in such a way that an ambiguity can
arise. For example, a class scope inside of a class scope doesn't do
anything interesting because the nested class cannot collide with
another nested class.
There are problems with this scheme:
- It is unreliable. The counter is only incremented when a previously
unencountered scope is entered. There are cases where this will cause
ambiguity between declarations that have the same name, where one was
introduced in a deep scope and the other was introduced right after in
the previous lexical scope.
- It is wasteful. Statements like: {{{{{{{ static int foo = a; }}}}}}}
will make the mangling of "foo" larger than it need be because the
scope counter has been incremented many times.
Because of these problems, and practical implementation concerns, we
choose not to implement this scheme if the local static or local type
isn't visible. The mangling of these declarations will look very
similar, but the numbering will make far more sense; this scheme is
lifted from the Itanium ABI implementation.
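An illustrative example (not from the patch) of why some discriminator is
needed at all:

  // Two local statics named 'x' in the same function: only the lexical
  // scope distinguishes them, so the mangled names need a discriminator.
  int f(bool b) {
    if (b) {
      static int x = 1; // one scope number in the mangled name
      return x;
    }
    static int x = 2;   // same name, different scope number
    return x;
  }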
Reviewers: rsmith, doug.gregor, rnk, eli.friedman, cdavis5x
Reviewed By: rnk
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D2953
llvm-svn: 202951
This fails an "isa<> used with null pointer" assert during a clang-cl
self-host on Windows. This was caused by r202769, and I'm currently
reducing a test case.
Reviewers: dblaikie
Differential Revision: http://llvm-reviews.chandlerc.com/D2944
llvm-svn: 202888
An ivar offset load is invariant iff it occurs inside an instance method and
the ivar belongs to the instance method's class or one of its super classes.
// rdar://16095748
llvm-svn: 202872
We should only have this optimization fire when the explicit
instantiation definition would cause at least one member function to be
emitted, thus ensuring that even a compiler not performing this
optimization would still emit the full type information elsewhere.
But we should still pessimize the output by always emitting the
definition when the explicit instantiation definition appears, so that at
some point in the future we can depend on that information even when no
code had to be emitted in that TU. (This shouldn't happen very often,
since people mostly use explicit spec decl/defs to reduce code size -
but perhaps one day they could use them to explicitly reduce debug info
size too.)
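For illustration (not from the patch), the kind of TU that now anchors the
type's debug info:

  // An explicit instantiation definition that emits at least one member
  // function in this TU; the full debug info for S<int> can live here.
  template <typename T> struct S {
    void f() {} // emitted by the explicit instantiation below
  };
  template struct S<int>; // explicit instantiation definition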
This was worth about 2% for Clang and LLVM - so not a huge win, but a
win. It looks really great for simple STL programs (include <string> and
just declare a string - 14k -> 1.4k of .dwo)
llvm-svn: 202769
When lowering a bitfield, CGRecordLowering would assign the wrong
storage type to a bitfield in some cases and trigger an assertion. In
these cases the layout was still correct, just the bitfield info was
wrong.
llvm-svn: 202562
A 'remark' is information that is not an error or a warning, but rather some
additional information provided to the user. In contrast to a 'note', a 'remark'
is an independent diagnostic, whereas a 'note' always depends on another
diagnostic.
A typical use case for remarks is extra information for the user, e.g.
information from the vectorizer about loops that have been vectorized.
This patch provides the initial implementation of 'remarks'. It includes the
actual definition of the remark nodes, their printing, as well as basic
parameter handling. We are reusing the existing diagnostic parameters, which
means a remark
can be enabled with normal '-Wdiagnostic-name' flags and can be upgraded to
an error using '-Werror=diagnostic-name'. '-Werror' alone does not upgrade
remarks.
This patch is intentionally minimal in terms of parameter handling. More
experience and more discussion will most likely lead to further enhancements
in the parameter handling.
llvm-svn: 202475
In r201528, I changed the PGO instrumentation counter for a "do" loop to not
include the fall-through count. That fall-through count is included later, but
it means that this assertion may fail for "do" loops.
llvm-svn: 202437
Summary:
This merges VFPtrInfo and VBTableInfo into VPtrInfo, since they hold
almost the same information. With that change, the vbtable mangling
code can easily be applied to vftable data and we magically get the
correct, unambiguous vftable names.
Fixes PR17748.
Reviewers: timurrrr, majnemer
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D2893
llvm-svn: 202425
In llvm the only semantic difference between internal and private is that llvm
tries to hide private globals by mangling them with a private prefix. Since
the globals changed by this patch already had the magic don't-mangle marker,
there should be no change in the generated assembly.
A followup patch should then be able to drop the \01L and \01l prefixes and let
llvm mangle as appropriate.
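A minimal sketch of the distinction (assuming an existing Module; illustrative
only): the semantics are identical, only the emitted symbol name differs:

  #include "llvm/IR/Constants.h"
  #include "llvm/IR/GlobalVariable.h"
  #include "llvm/IR/Module.h"

  void makeGlobals(llvm::Module &M) {
    llvm::Type *I32 = llvm::Type::getInt32Ty(M.getContext());
    llvm::Constant *Zero = llvm::ConstantInt::get(I32, 0);
    // internal: a local symbol, name emitted as-is.
    new llvm::GlobalVariable(M, I32, /*isConstant=*/true,
                             llvm::GlobalValue::InternalLinkage, Zero, "foo");
    // private: same semantics, but the name gets the platform's private
    // prefix (e.g. 'L' on Darwin), keeping it out of the symbol table.
    new llvm::GlobalVariable(M, I32, /*isConstant=*/true,
                             llvm::GlobalValue::PrivateLinkage, Zero, "bar");
  }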
llvm-svn: 202419
Clang is using llvm::StructType::isOpaque() as a way of signaling if
we've finished record type conversion in
CodeGenTypes::isRecordLayoutComplete(). However, Clang was setting the
body of the type before it finished laying out the type as a base type.
Laying out the %class.C.base LLVM type attempts to convert more types,
eventually recursively attempting to layout 'C' again, at which point we
would say that layout was complete, even though we were still in the
middle of it.
By not setting the body, we correctly signal that layout is not
complete, and things work as expected.
At some point, it might be worth refactoring this to avoid looking at
the LLVM IR types under construction.
llvm-svn: 202320
Before this patch the globals were created with the wrong linkage and patched
afterwards. From the comments it looks like something would complain about
having an internal GV with no initializer. At least in clang the verifier will
only run way after we set the initializer, so that is not a problem.
This patch should be a nop. It just figures out the linkage earlier and
converts the old calls to setLinkage to asserts. The only case where that is
not possible is when we first see a weak import that is then implemented. In
that case we have to change the linkage, but that is the only setLinkage left.
llvm-svn: 202305
The 'f' modifier is really designed for integer type arguments (according to
its documentation). It's better to use the "half width, same number" modifier.
Should be no user-visible change.
llvm-svn: 202152
The __forceinline keyword's semantics are now recast as AlwaysInline and
the kw___forceinline token has its language mode set for KEYMS.
This preserves the semantics of the previous implementation but with
less duplication of code.
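Usage is unchanged; for example, under -fms-extensions:

  // __forceinline now lowers to the same AlwaysInline attribute that
  // __attribute__((always_inline)) produces.
  __forceinline int twice(int x) { return 2 * x; }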
llvm-svn: 202131
Take advantage of CharUnits::alignmentAtOffset instead of calculating it
by hand.
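A small illustration of the helper (the wrapper is hypothetical; the method
lives on clang::CharUnits):

  #include "clang/AST/CharUnits.h"

  // The alignment guaranteed at a byte offset from a base with a known,
  // non-zero alignment; e.g. an 8-byte-aligned base guarantees 4-byte
  // alignment at offset 4.
  clang::CharUnits alignAt(clang::CharUnits BaseAlign,
                           clang::CharUnits Offset) {
    return BaseAlign.alignmentAtOffset(Offset);
  }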
Differential Revision: http://llvm-reviews.chandlerc.com/D2862
llvm-svn: 202098
Previously the X86 backend would look for the sret attribute and handle
this for us. inalloca takes that all away, so we have to do the return
ourselves now.
llvm-svn: 202097
We still don't use the PGO to set branch weights for these loops, but at
least this keeps the compiler from crashing. <rdar://problem/16137778>
llvm-svn: 202002
CGRecordLayoutBuilder was aging, complex, multi-pass, and shows signs of
existing before ASTRecordLayoutBuilder. It redundantly performed many
layout operations that are now performed by ASTRecordLayoutBuilder and
asserted that the results were the same. With the addition of support
for the MS-ABI, such as placement of vbptrs, vtordisps, different
bitfield layout and a variety of other features, CGRecordLayoutBuilder
was growing unwieldy in its redundancy.
This patch re-architects CGRecordLayoutBuilder to not perform any
redundant layout but rather, as directly as possible, lower an
ASTRecordLayout to an llvm::Type. The new architecture is significantly
smaller and simpler than the CGRecordLayoutBuilder and contains fewer
ABI-specific code paths. It's also one pass.
The architecture of the new system is described in the comments. For the
most part, the new system simply takes all of the fields and bases from
an ASTRecordLayout, sorts them, inserts padding and dumps a record.
Bitfields, unions and primary virtual bases make this process a bit more
complicated. See the inline comments.
In addition, this patch updates a few lit tests due to the fact that the
new system computes more accurate llvm types than CGRecordLayoutBuilder.
Each change is commented individually in the review.
Differential Revision: http://llvm-reviews.chandlerc.com/D2795
llvm-svn: 201907
Because GCC incorrectly defines _mm_prefetch to take anything that casts
to void*, people have started using that behavior. The previous patch
that made _mm_prefetch actually take a const char * broke compatibility
with existing code. This update to the patch leaves the macro that
defines _mm_prefetch with the (void*) cast when _MSC_VER is not defined.
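A hedged sketch of the arrangement (not the verbatim header):

  // Keep the GCC-compatible casting macro off MSVC; when _MSC_VER is
  // defined, the builtin is used directly so the declaration in
  // windows.h doesn't conflict.
  #ifndef _MSC_VER
  #define _mm_prefetch(a, sel) (__builtin_prefetch((void *)(a), 0, (sel)))
  #endif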
llvm-svn: 201901
This extends the intrinsic lookup table format slightly, and adds
entries for use by the shared ARM/AArch64 definitions. The benefit is
currently smaller than for the SISD intrinsics (there's more custom
code implementing this set), but a few lines are saved and there's
scope for future expansion.
llvm-svn: 201848
This extracts the table-driven intrinsic lookup phase into a separate
function, to be used by EmitCommonNeonBuiltinExpr soon.
It also simplifies the logic used in that lookup, since VectorCastArgN
and ScalarArgN were actually identical.
llvm-svn: 201847
This does:
- clang_tablegen() adds each tblgen'd target to the global property
CLANG_TABLEGEN_TARGETS as a list.
- The list of targets is added to LLVM_COMMON_DEPENDS.
- All clang libraries and targets depend on the generated headers.
You might wonder whether this would be a regression, but in fact it costs
little:
- Almost all of the clang libraries depend on tblgen'd files and clang-tblgen.
- clang-tblgen may cause a short stall, but doesn't cause an unconditional
rebuild.
- Each library's dependencies on tblgen'd files might vary with the structure
of the headers, which made it hard to track and update *really optimal*
dependencies.
Each dependency on intrinsics_gen and ClangSACheckers is left as DEPENDS.
llvm-svn: 201842
Virtual methods expect 'this' to point to the vfptr containing the
virtual method, and this extends to virtual member pointer thunks. The
relevant vfptr is always at offset zero on entry to the thunk, and no
'this' adjustment is needed.
Previously we would not include the vfptr adjustment in the member
pointer, and we'd look at the vfptr offset when loading from the vftable
in the thunk.
Fixes PR18917.
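An illustrative example of the situation (not from the patch):

  // For &C::g, the member pointer itself must carry the adjustment to
  // B's vfptr; the thunk it points at assumes that vfptr is at offset 0.
  struct A { virtual void f(); };
  struct B { virtual void g(); };
  struct C : A, B { void g() override; };

  void call(C *c, void (C::*mp)()) {
    (c->*mp)(); // e.g. mp = &C::g dispatches through a vtable thunk
  }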
llvm-svn: 201835
The MS ABI requires that we determine the vbptr offset if we have a
virtual inheritance model. Instead, raise an error pointing to the
diagnostic when this happens.
This fixes PR18583.
Differential Revision: http://llvm-reviews.chandlerc.com/D2842
llvm-svn: 201824
This breaks backwards compatibility with existing code. Previously, this
was defined as
#define _mm_prefetch(a, sel) (__builtin_prefetch((void *)(a), 0, (sel)))
which basically accepts any pointer. Changing this to char* simply
breaks a lot of existing code. I have tried changing char* to
"const void*", which seems to be the right thing: as per the Intel
specification, this should work on basically any pointer. However,
apparently this breaks windows compatibility (because of a conflicting
declaration in windows.h).
So, we probably need to #ifdef this based on whether clang is compiling
for windows. According to Chandler, this might be done by introducing an
additional symbol to a fake type in BuiltinsX86.def and then conditioning
the type expansion on the platform.
llvm-svn: 201775
This patch adds several built-ins that are required for MS
compatibility. _mm_prefetch must be a built-in because it takes a
compile-time constant argument, and our prior approach of using a #define
to the current built-in doesn't work in the presence of re-declaration
of _mm_prefetch. The others can be obtained by including the windows
system headers. If a user includes the windows system headers but not
intrin.h, they still need to work and therefore must be built-in, because
we don't get a chance to implement them in intrin.h in this case.
llvm-svn: 201734
This fixes one immediate bug where an expression with side-effects
could be emitted twice during a NEON call.
It also prepares the way for folding CodeGen for many of the SISD
intrinsics into a table, reducing code size and hopefully increasing
performance eventually ("binary search + few switch cases" should be
better than "lots of switch cases").
llvm-svn: 201667
These instructions (well, the f32 ones) are supported on 32-bit ARMv8, not just
AArch64. Now that the arm_neon.td refactoring is complete, adding them is
surprisingly simple.
rdar://problem/16035743
llvm-svn: 201661
Summary:
Generally the vector deleting dtor, which we model as a vtable thunk,
takes care of non-virtual adjustment and delegates to the other
destructor variants. The other non-complete destructor variants assume
that 'this' on entry points to the virtual base subobject that first
declared the virtual destructor.
We need to change the adjustment in both the prologue and the vdtor call
setup.
Reviewers: timurrrr
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D2821
llvm-svn: 201612
Previously, we made one traversal of the AST prior to codegen to assign
counters to the ASTs and then propagated the count values during codegen. This
patch now adds a separate AST traversal prior to codegen for the
-fprofile-instr-use option to propagate the count values. The counts are then
saved in a map from which they can be retrieved during codegen.
This new approach has several advantages:
1. It gets rid of a lot of extra PGO-related code that had previously been
added to codegen.
2. It fixes a serious bug. My original implementation (which was mailed to the
list but never committed) used 3 counters for every loop. Justin improved it to
move 2 of those counters into the less-frequently executed breaks and continues,
but that turned out to produce wrong count values in some cases. The solution
requires visiting a loop body before the condition so that the count for the
condition properly includes the break and continue counts. Changing codegen to
visit a loop body first would be a fairly invasive change, but with a separate
AST traversal, it is easy to control the order of traversal. I've added a
testcase (provided by Justin) to make sure this works correctly.
3. It reduces the instrumentation overhead, cutting the number of counters for
a loop from 3 to 1. We no longer need dedicated counters for breaks and
continues, since we can just use the propagated count values when visiting
breaks and continues.
To make this work, I needed to make a change to the way we count case
statements, going back to my original approach of not including the fall-through
in the counter values. This was necessary because there isn't always an AST node
that can be used to record the fall-through count. Now case statements are
handled the same as default statements, with the fall-through paths branching
over the counter increments. While I was at it, I also went back to using this
approach for do-loops -- omitting the fall-through count into the loop body
simplifies some of the calculations and makes them behave the same as other
loops. Whenever we start using this instrumentation for coverage, we'll need
to add the fall-through counts into the counter values.
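An illustrative example (not from the patch) of the counter reduction:

  // With count propagation, this loop needs only one counter (on the
  // body); break and continue counts fall out of the propagated values
  // instead of needing dedicated counters of their own.
  int sum(const int *a, int n) {
    int s = 0;
    for (int i = 0; i < n; ++i) { // condition count derives from the body
                                  // and continue counts: body-first visit
      if (a[i] < 0)
        continue; // no dedicated counter
      if (a[i] > 100)
        break;    // no dedicated counter
      s += a[i];
    }
    return s;
  }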
llvm-svn: 201528
When a function has a single counter, we will offset the pointer by 1 when
parsing the next function. If a function has multiple counters, we are
okay after skipping the rest of the counters.
llvm-svn: 201456
properties by fixing shouldBindAsLValue to accept arrays
(like record types) because we always manipulate
them in memory. Patch suggested by John McCall.
// rdar://15610943
llvm-svn: 201428
This changes ARM to use @llvm.fabs for floating-point vabs. Patterns
already existed in the backend, and it might help mid-end phases since
it's more likely to be understood than @llvm.arm.neon.vabs.
llvm-svn: 201313
The s64/u64 vcvt conversion operations are actually pretty much identical to
the s32/u32 ones in implementation, and can be shared with just one extra
variable.
llvm-svn: 201145
The XCore target ABI requires const data that is externally visible
to be handled differently if it has C-language linkage rather than
C++ language linkage.
llvm-svn: 201142
According to the AAPCS, we can split structs between GPRs and the stack,
except when an argument has already been allocated on the stack. This
can occur when a large number of floating-point arguments fill up the VFP
registers and are allocated on the stack before the general-purpose argument
registers are full.
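A hedged illustration of the corner case (hard-float AAPCS assumed; the names
are made up):

  // d0-d7 take the first eight doubles, so 'spill' lands on the stack
  // even though r0-r3 are still free. Once that happens, 'big' may no
  // longer be split between the GPRs and the stack.
  struct Big { int a, b, c, d, e; };
  void f(double d0, double d1, double d2, double d3,
         double d4, double d5, double d6, double d7,
         double spill,    // VFP argument registers are full: to the stack
         struct Big big); // must not be split across GPRs and stack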
llvm-svn: 201137
This option has the following effects:
* It adds the sspstrong IR attribute to each function within the CU.
* It defines the macro __SSP_STRONG__ with the value of 2.
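For example, the macro is observable from user code:

  // Detectable in any TU built with the new option:
  #if defined(__SSP_STRONG__) && __SSP_STRONG__ == 2
  // stack-protector-strong is active for this translation unit
  #endif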
Differential Revision: http://llvm-reviews.chandlerc.com/D2717
llvm-svn: 201120
Now that both ARM backends use the same implementation for vshll operations,
the code can be shared. This is also a necessary LLVM/Clang interface update.
llvm-svn: 201094
Now that the backend supports the natural LLVM IR, we can shamelessly steal the
AArch64 front-end code to implement the vshrn intrinsic on 32-bit ARM.
llvm-svn: 201086
unique them and permits the implementation of dynamic_cast (and
anything else which knows it's working with a complete class
type) to compare their addresses directly.
rdar://16005328
llvm-svn: 201020