Commit Graph

470 Commits

Author SHA1 Message Date
Alexey Samsonov a0ac3c2bf0 [ASan] Improve blacklisting of global variables.
This commit changes the way we blacklist global variables in ASan.
Now the global is excluded from instrumentation (either regular
bounds checking, or initialization-order checking) if:

1) Global is explicitly blacklisted by its mangled name.
This part is left unchanged.

2) SourceLocation of a global is in blacklisted source file.
This changes the old behavior, where instead of looking at the
SourceLocation of a variable we simply considered llvm::Module
identifier. This was wrong, as identifier may not correspond to
the file name, and we incorrectly disabled instrumentation
for globals coming from #include'd files.

3) Global is blacklisted by type.
Now we build the type of a global variable using Clang machinery
(QualType::getAsString()), instead of llvm::StructType::getName().

After this commit, the active users of ASan blacklist files
may have to revisit them (this is a backwards-incompatible change).

llvm-svn: 220097
2014-10-17 22:37:33 +00:00
Alexey Samsonov 1444bb9fc8 SanitizerBlacklist: blacklist functions by their source location.
This commit changes the way we blacklist functions in ASan, TSan,
MSan and UBSan. We used to treat function as "blacklisted"
and turned off instrumentation in it in two cases:

1) Function is explicitly blacklisted by its mangled name.
This part is not changed.

2) Function is located in llvm::Module, whose identifier is
contained in the list of blacklisted sources. This is completely
wrong, as llvm::Module may not correspond to the actual source
file function is defined in. Also, function can be defined in
a header, in which case user had to blacklist the .cpp file
this header was #include'd into, not the header itself.
Such functions could cause other problems - for instance, if the
header was included in multiple source files, compiled
separately and linked into a single executable, we could end up
with both instrumented and non-instrumented version of the same
function participating in the same link.

After this change we will make blacklisting decision based on
the SourceLocation of a function definition. If a function is
not explicitly defined in the source file, (for example, the
function is compiler-generated and responsible for
initialization/destruction of a global variable), then it will
be blacklisted if the corresponding global variable is defined
in blacklisted source file, and will be instrumented otherwise.

After this commit, the active users of blacklist files may have
to revisit them. This is a backwards-incompatible change, but
I don't think it's possible or makes sense to support the
old incorrect behavior.

I plan to make similar change for blacklisting GlobalVariables
(which is ASan-specific).

llvm-svn: 219997
2014-10-17 00:20:19 +00:00
David Majnemer bb525f7c20 CodeGen: Don't drop thread_local when emitting __thread aliases
CodeGen wouldn't mark the aliasee as thread_local if the aliasee was a
tentative definition.

Even if the definition was already emitted, it would never mark the
alias as thread_local.

This fixes PR21288.

llvm-svn: 219859
2014-10-15 22:38:23 +00:00
Alexey Samsonov 0b15e34bd1 Move SanitizerBlacklist object from CodeGenModule to ASTContext.
Soon we'll need to have access to blacklist before the CodeGen
phase (see http://reviews.llvm.org/D5687), so parse and construct
the blacklist earlier.

llvm-svn: 219857
2014-10-15 22:17:27 +00:00
Alexey Samsonov 8d043e82ef Move SanitizerBlacklist to clangBasic. NFC.
This change moves SanitizerBlacklist.h from lib/CodeGen
to public Clang headers in include/clang/Basic. SanitizerBlacklist
is currently only used in CodeGen to decide which functions/modules
should be instrumented, but this will soon change as ASan will
optionally modify class layouts during AST construction
(http://reviews.llvm.org/D5687). We need blacklist machinery
to be available at this point.

llvm-svn: 219840
2014-10-15 19:57:45 +00:00
Alexey Bataev ec4747802a Fix for bug http://llvm.org/PR17427.
Assertion failed: "Computed __func__ length differs from type!"
Reworked PredefinedExpr representation with internal StringLiteral field for function declaration.
Differential Revision: http://reviews.llvm.org/D5365

llvm-svn: 219393
2014-10-09 08:45:04 +00:00
Reid Kleckner 453e056467 Fix IRGen for referencing a static local before emitting its decl
Summary:
Previously CodeGen assumed that static locals were emitted before they
could be accessed, which is true for automatic storage duration locals.
However, it is possible to have CodeGen emit a nested function that uses
a static local before emitting the function that defines the static
local, breaking that assumption.

Fix it by creating the static local upon access and ensuring that the
deferred function body gets emitted. We may not be able to emit the
initializer properly from outside the function body, so don't try.

Fixes PR18020.  See also previous attempts to fix static locals in
PR6769 and PR7101.

Reviewers: rsmith

Subscribers: cfe-commits

Differential Revision: http://reviews.llvm.org/D4787

llvm-svn: 219265
2014-10-08 01:07:54 +00:00
David Majnemer b3341ea453 MS ABI: Implement thread_local for global variables
Summary:
This add support for the C++11 feature, thread_local global variables.
The ABI Clang implements is an improvement of the MSVC ABI.  Sadly,
further improvements could be made but not without sacrificing ABI
compatibility.

The feature is implemented as follows:
- All thread_local initialization routines are pointed to from the
  .CRT$XDU section.
- All non-weak thread_local variables have their initialization routines
  call from a single function instead of getting their own .CRT$XDU
  section entry.  This is done to open up optimization opportunities to
  the compiler.
- All weak thread_local variables have their own .CRT$XDU section entry.
  This entry is in a COMDAT with the global variable it is initializing;
  this ensures that we will initialize the global exactly once.
- Destructors are registered in the initialization function using
  __tlregdtor.

Differential Revision: http://reviews.llvm.org/D5597

llvm-svn: 219074
2014-10-05 05:05:40 +00:00
Reid Kleckner 739aa12b79 Revert "Don't use comdats for initializers on platforms that don't support it"
On further investigation, COMDATs should work with .ctors, and the issue
I was hitting probably reproduces with .init_array.

This reverts commit r218287.

llvm-svn: 218313
2014-09-23 16:20:01 +00:00
Reid Kleckner 6c03130542 Don't use comdats for initializers on platforms that don't support it
In particular, pre-.init_array ELF uses the .ctors section mechanism.
MinGW COFF also uses .ctors, now that I think about it. Therefore,
restrict this optimization to the two platforms that are currently known
to work: ELF with .init_array and COFF with .CRT$XCU.

llvm-svn: 218287
2014-09-23 00:00:14 +00:00
Dario Domizioli c4fb8ca7aa Fix ctor/dtor aliases losing 'dllexport' (for Itanium ABI)
This patch makes sure that the dllexport attribute is transferred to the alias when such alias is created. It only affects the Itanium ABI because for the MSVC ABI a workaround is in place to not generate aliases of dllexport ctors/dtors.
A new CodeGenModule function is provided, CodeGenModule::setAliasAttributes, to factor the code for transferring attributes to aliases.

llvm-svn: 218159
2014-09-19 22:06:24 +00:00
Rafael Espindola 9f834735cd Don't use the third field of llvm.global_ctors for MachO.
The field is defined as:

If the third field is present, non-null, and points to a global variable or function, the initializer function will only run if the associated data from the current module is not discarded.

And without COMDATs we can't implement that.

llvm-svn: 218097
2014-09-19 01:54:22 +00:00
Rafael Espindola 5b6fa2f85a Revert "Put more stuff in the comdat used for variables with static init."
This reverts commit r218089.
It looks like it was causing issues on COFF.

llvm-svn: 218094
2014-09-19 01:28:16 +00:00
Rafael Espindola c0ce9eca0b Put more stuff in the comdat used for variables with static init.
Clang can already handle

-------------------------------------------
struct S {
  static const int x;
};
template<typename T> struct U {
  static const int k;
};
template<typename T> const int U<T>::k = T::x;

const int S::x = 42;
extern const int *f();
const int *g() { return &U<S>::k; }
int main() {
  return *f() + U<S>::k;
}

const int *f() { return &U<S>::k; }
-------------------------------------------

since r217264 which puts the .inint_array section in the same COMDAT
as the variable.

This patch allows the linker to more easily delete some dead code and data by
putting the guard variable and init function in the same COMDAT.

llvm-svn: 218089
2014-09-18 23:41:44 +00:00
Rafael Espindola 1e4df92f49 Add support for putting constructors and destructos in explicit comdats.
There are situations when clang knows that the C1 and C2 constructors
or the D1 and D2 destructors are identical. We already optimize some
of these cases, but cannot optimize it when the GlobalValue is
weak_odr.

The problem with weak_odr is that an old TU seeing the same code will
have a C1 and a C2 comdat with the corresponding symbols. We cannot
suddenly start putting the C2 symbol in the C1 comdat as we cannot
guarantee that the linker will not pick a .o with only C1 in it.

The solution implemented by GCC is to expand the ABI to have a comdat
whose name uses a C5/D5 suffix and always has both symbols. That is
what this patch implements.

llvm-svn: 217874
2014-09-16 15:18:21 +00:00
Rafael Espindola 5368618d72 Reduce code duplication a bit more. NFC.
llvm-svn: 217811
2014-09-15 19:34:18 +00:00
Rafael Espindola 91f68b43c3 Move emitCXXStructor to CGCXXABI.
A followup patch will address the code duplication.

llvm-svn: 217807
2014-09-15 19:20:10 +00:00
Rafael Espindola 02b77f4248 Create a emitCXXStructor function and make the existing emitCXXConstructor and
emitCXXDestructor static helpers.

A next patch will make it a helper in CGCXXABI.

llvm-svn: 217804
2014-09-15 18:46:13 +00:00
Rafael Espindola 1ac0ec86b7 Merge GetAddrOfCXXConstructor and GetAddrOfCXXDonstructor. NFC.
llvm-svn: 217598
2014-09-11 15:42:06 +00:00
Nico Weber 17d3a2c3f8 Remove a parameter that has been unused since r188481. No behavior change.
llvm-svn: 217386
2014-09-08 16:26:36 +00:00
Rafael Espindola 8d2a19b478 Handle constructors and destructors a bit more uniformly in CodeGen.
There were code paths that are duplicated for constructors and destructors just
because we have both CXXCtorType and CXXDtorsTypes.

This patch introduces an unified enum and reduces code deplication a bit.

llvm-svn: 217383
2014-09-08 16:01:27 +00:00
James Dennett 64fa12372c Typo fix in comments: definintion -> definition
llvm-svn: 215764
2014-08-15 20:04:40 +00:00
Benjamin Kramer 2f5db8b3db Header guard canonicalization, clang part.
Modifications made by clang-tidy with minor tweaks.

llvm-svn: 215557
2014-08-13 16:25:19 +00:00
Alex Lorenz ee02499a8f Add coverage mapping generation.
This patch adds the '-fcoverage-mapping' option which
allows clang to generate the coverage mapping information
that can be used to provide code coverage analysis using
the execution counts obtained from the instrumentation 
based profiling (-fprofile-instr-generate).

llvm-svn: 214752
2014-08-04 18:41:51 +00:00
Alexey Samsonov 4b8de11c81 [Sanitizer] Introduce SanitizerMetadata class.
It is responsible for generating metadata consumed by sanitizer instrumentation
passes in the backend. Move several methods from CodeGenModule to SanitizerMetadata.
For now the class is stateless, but soon it won't be the case.

Instead of creating globals providing source-level information to ASan, we will create
metadata nodes/strings which will be turned into actual global variables in the
backend (if needed).

No functionality change.

llvm-svn: 214564
2014-08-01 21:35:28 +00:00
Richard Smith 00db2f13f8 PR20473: Don't "deduplicate" string literals with the same value but different
lengths! In passing, simplify string literal deduplication by relying on LLVM
to deduplicate the underlying constant values.

llvm-svn: 214222
2014-07-29 21:20:12 +00:00
Reid Kleckner 1a711b1696 -fms-extensions: Implement half of #pragma init_seg
Summary:
This pragma is very rare.  We could *hypothetically* lower some uses of
it down to @llvm.global_ctors, but given that GlobalOpt isn't able to
optimize prioritized global ctors today, there's really no point.

If we wanted to do this in the future, I would check if the section used
in the pragma started with ".CRT$XC" and had up to two characters after
it.  Those two characters could form the 16-bit initialization priority
that we support in @llvm.global_ctors.  We would have to teach LLVM to
lower prioritized global ctors on COFF as well.

This should let us compile some silly uses of this pragma in WebKit /
Blink.

Reviewers: rsmith, majnemer

Subscribers: cfe-commits

Differential Revision: http://reviews.llvm.org/D4549

llvm-svn: 213593
2014-07-22 00:53:05 +00:00
Alexey Samsonov 6c12414358 Make sure globals created by UBSan are not instrumented by ASan.
Summary:
This change adds description of globals created by UBSan
instrumentation (UBSan handlers, type descriptors, filenames) to
llvm.asan.globals metadata, effectively "blacklisting" them. This can
dramatically decrease the data section in binaries built with UBSan+ASan,
as UBSan tends to create a lot of handlers, and ASan instrumentation
increases the global size to at least 64 bytes.

Test Plan: clang regression test suite

Reviewers: rsmith

Reviewed By: rsmith

Subscribers: cfe-commits, byoungyoung, kcc

Differential Revision: http://reviews.llvm.org/D4575

llvm-svn: 213392
2014-07-18 17:50:06 +00:00
Alexey Samsonov 15c9669615 [ASan] Collect unmangled names of global variables in Clang to print them in error reports.
Currently ASan instrumentation pass creates a string with global name
for each instrumented global (to include global names in the error report). Global
name is already mangled at this point, and we may not be able to demangle it
at runtime (e.g. there is no __cxa_demangle on Android).

Instead, create a string with fully qualified global name in Clang, and pass it
to ASan instrumentation pass in llvm.asan.globals metadata. If there is no metadata
for some global, ASan will use the original algorithm.

This fixes https://code.google.com/p/address-sanitizer/issues/detail?id=264.

llvm-svn: 212872
2014-07-12 00:42:52 +00:00
Alexey Samsonov b2cc23df20 Be more specific about return types of some methods.
This would allow to call addCompilerUsedGlobal on some
Clang-generated globals.

llvm-svn: 212767
2014-07-10 22:18:36 +00:00
Alexey Samsonov b7dd329f2f Decouple llvm::SpecialCaseList text representation and its LLVM IR semantics.
Turn llvm::SpecialCaseList into a simple class that parses text files in
a specified format and knows nothing about LLVM IR. Move this class into
LLVMSupport library. Implement two users of this class:
  * DFSanABIList in DFSan instrumentation pass.
  * SanitizerBlacklist in Clang CodeGen library.
The latter will be modified to use actual source-level information from frontend
(source file names) instead of unstable LLVM IR things (LLVM Module identifier).

Remove dependency edge from ClangCodeGen/ClangDriver to LLVMTransformUtils.

No functionality change.

llvm-svn: 212643
2014-07-09 19:40:08 +00:00
Alexey Samsonov ac4afe49e7 [Sanitizer] Remove brittle cache variable and slightly simplify blacklisting code.
Now CodeGenFunction is responsible for looking at sanitizer blacklist
(in CodeGenFunction::StartFunction) and turning off instrumentation,
if necessary.

No functionality change.

llvm-svn: 212501
2014-07-07 23:59:57 +00:00
Alexey Samsonov e7a8ccfaad [Sanitizer] Reduce the usage of sanitizer blacklist in CodeGenModule
Get rid of cached CodeGenModule::SanOpts, which was used to turn off
sanitizer codegen options if current LLVM Module is blacklisted, and use
plain LangOpts.Sanitize instead.

1) Some codegen decisions (turning TBAA or writable strings on/off)
   shouldn't depend on the contents of blacklist.

2) llvm.asan.globals should *always* be created, even if the module
   is blacklisted - soon Clang's CodeGen where we read sanitizer
   blacklist files, so we should properly report which globals are
   blacklisted to the backend.

llvm-svn: 212499
2014-07-07 23:34:34 +00:00
David Majnemer e2cb8d198f CodeGen: Refactor RTTI emission
Let's not expose ABI specific minutia inside of CodeGenModule and Type.
Instead, let's abstract it through CXXABI.

This gets rid of:
CodeGenModule::getCompleteObjectLocator,
CodeGenModule::EmitFundamentalTypeDescriptor{s,},
CodeGenModule::getMSTypeDescriptor,
CodeGenModule::getMSCompleteObjectLocator,
CGCXXABI::shouldRTTIBeUnique,
CGCXXABI::classifyRTTIUniqueness.

CGRTTI was *almost* entirely centered around providing Itanium-style
RTTI information.  Instead of providing interfaces that only it
consumes, move it to the ItaniumCXXABI implementation file.  This allows
it to have access to Itanium-specific implementation details without
providing useless expansion points for the Microsoft ABI side.

Differential Revision: http://reviews.llvm.org/D4261

llvm-svn: 212435
2014-07-07 06:20:47 +00:00
Robert Lytton 57765d5347 Move the calling of emitTargetMD() later.
Summary:
Because a global created by GetOrCreateLLVMGlobal() is not finalised until later viz:
  extern char a[];
  char f(){ return a[5];}
  char a[10];

Change MangledDeclNames to use a MapVector rather than a DenseMap so that the
Metadata is output in order of original declaration, so to make deterministic
and improve human readablity.

Differential Revision: http://reviews.llvm.org/D4176

llvm-svn: 212263
2014-07-03 09:30:33 +00:00
Alexey Samsonov 4f319cca42 [ASan] Print exact source location of global variables in error reports.
See https://code.google.com/p/address-sanitizer/issues/detail?id=299 for the
original feature request.

Introduce llvm.asan.globals metadata, which Clang (or any other frontend)
may use to report extra information about global variables to ASan
instrumentation pass in the backend. This metadata replaces
llvm.asan.dynamically_initialized_globals that was used to detect init-order
bugs. llvm.asan.globals contains the following data for each global:
  1) source location (file/line/column info);
  2) whether it is dynamically initialized;
  3) whether it is blacklisted (shouldn't be instrumented).

Source location data is then emitted in the binary and can be picked up
by ASan runtime in case it needs to print error report involving some global.
For example:

  0x... is located 4 bytes to the right of global variable 'C::array' defined in '/path/to/file:17:8' (0x...) of size 40

These source locations are printed even if the binary doesn't have any
debug info.

This is an ABI-breaking change. ASan initialization is renamed to
__asan_init_v4(). Pre-built libraries compiled with older Clang will not work
with the fresh runtime.

llvm-svn: 212188
2014-07-02 16:54:41 +00:00
David Majnemer d905da4a5f MS ABI: Reference MSVC RTTI from the VFTable
The pointer for a class's RTTI data comes right before the VFTable but
has no name.  To be properly compatible with this, we do the following:
* Create a single GlobalVariable which holds the contents of the VFTable
  _and_ the pointer to the RTTI data.
* Create a GlobalAlias, with appropriate linkage/visibility, that points
  just after the RTTI data pointer.  This ensures that the VFTable
  symbol will always refer to VFTable data.
* Create a Comdat with a "Largest" SelectionKind and stick the private
  GlobalVariable in it.  By transitivity, the GlobalAlias will be a
  member of the Comdat group.  Using "Largest" ensures that foreign
  definitions without an RTTI data pointer will _not_ be chosen in the
  final linked image.

Whether or not we emit RTTI data depends on several things:
* The -fno-rtti flag implies that we should never not emit a pointer to
  RTTI data before the VFTable.
* __declspec(dllimport) brings in the VFTable from a remote DLL. Use an
  available_externally GlobalVariable to provide a local definition of
  the VFTable.  This means that we won't have any available_externally
  definitions of things like complete object locators.  This is
  acceptable because they are never directly referenced.

To my knowledge, this completes the implementation of MSVC RTTI code
generation.

Further semantic work should be done to properly support /GR-.

llvm-svn: 212125
2014-07-01 20:30:31 +00:00
Justin Bogner 40b8ba1496 CodeGen: Improve warnings about uninstrumented files when profiling
Improve the warning when building with -fprofile-instr-use and a file
appears not to have been profiled at all. This keys on whether a
function is defined in the main file or not to avoid false negatives
when one includes a header with functions that have been profiled.

llvm-svn: 211760
2014-06-26 01:45:07 +00:00
Alp Toker fb8d02b179 Implement -Wframe-larger-than backend diagnostic
Add driver and frontend support for the GCC -Wframe-larger-than=bytes warning.
This is the first GCC-compatible backend diagnostic built around LLVM's
reporting feature.

This commit adds infrastructure to perform reverse lookup from mangled names
emitted after LLVM IR generation. We use that to resolve precise locations and
originating AST functions, lambdas or block declarations to produce seamless
codegen-guided diagnostics.

An associated change, StringMap now maintains unique mangled name strings
instead of allocating copies. This is a net memory saving in C++ and a small
hit for C where we no longer reuse IdentifierInfo storage, pending further
optimisation.

llvm-svn: 210293
2014-06-05 22:10:59 +00:00
Alexey Samsonov adc9037b88 Remove the overload of GetAddrOfConstantString method
llvm-svn: 210214
2014-06-04 20:25:57 +00:00
Alexey Samsonov 175e52f57b Refactor and generalize GetAddrOfConstantString and GetAddrOfConstantStringFromLiteral.
Share mode code between these functions and re-structure them in a way
which shows how similar they actually are. The latter function works well
with literals of multi-byte chars and does a GlobalVariable name mangling
(if global strings are non-writable).

No functionality change.

llvm-svn: 210212
2014-06-04 19:56:57 +00:00
Alp Toker 0e64e0d4fd Eliminate redundant MangleBuffer class
The only remaining user didn't actually use the non-dynamic storage facility
this class provides.

The std::string is transitional and likely to be StringRefized shortly.

llvm-svn: 210058
2014-06-03 02:13:57 +00:00
Hans Wennborg 275efb9e50 Don't dllimport/export destructor variants implemented by thunks.
MSVC doesn't export these functions, so trying to import them doesnt' work.
Also, don't let any dll attributes on the CXXDestructorDecl influence the
thunk's linkage -- they should always be linkonce_odr.

This takes care of the FIXME's for this in Nico's tests.

Differential Revision: http://reviews.llvm.org/D3930

llvm-svn: 209706
2014-05-28 01:52:23 +00:00
Reid Kleckner 563f0e852c Use comdats to avoid double initialization of weak data
Initializers of global data that can appear multiple TUs (static data
members of class templates or __declspec(selectany) data) are now in a
comdat group keyed on the global variable being initialized.  On
non-Windows platforms, this is a code size and startup time
optimization.  On Windows, this is necessary for ABI compatibility with
MSVC.

Fixes PR16959.

Reviewers: rsmith

Differential Revision: http://reviews.llvm.org/D3811

llvm-svn: 209555
2014-05-23 21:13:45 +00:00
Warren Hunt 5c2b4ea662 [MS-ABI] Implements MS-compatible RTTI
Enables the emission of MS-compatible RTTI data structures for use with 
typeid, dynamic_cast and exceptions.  Does not implement dynamic_cast 
or exceptions.  As an artiface, typeid works in some cases but proper 
support an testing will coming in a subsequent patch.

majnemer has fuzzed the results.  Test cases included.

Differential Revision: http://reviews.llvm.org/D3833

llvm-svn: 209523
2014-05-23 16:07:43 +00:00
Rafael Espindola b73c973d3b Don't set unnamed_addr in CreateRuntimeVariable.
This was fairly broken. For example,

@__dso_handle would or would not get an unnamed_addr depending on how many
global destructors were used in a translation unit.

The consensus was that not every runtime variable is unnamed_addr and that
__dso_handle handle should not be, so just don't add unnamed_addr in
CreateRuntimeVariable.

llvm-svn: 209484
2014-05-22 23:33:27 +00:00
Craig Topper 8a13c4180e [C++11] Use 'nullptr'. CodeGen edition.
llvm-svn: 209272
2014-05-21 05:09:00 +00:00
Hans Wennborg 1a3a07471f Rename CodeGenModule::getLLVMLinkageforDeclarator -> getLLVMLinkageForDeclarator
No functionality change.

llvm-svn: 208808
2014-05-14 19:54:53 +00:00
Rafael Espindola ee076a27ce Update for llvm API change.
llvm-svn: 208717
2014-05-13 18:45:53 +00:00
Rafael Espindola e98ed2f49a Don't repeat function name in comment.
llvm-svn: 208387
2014-05-09 01:34:38 +00:00