A pass that adds random noops to X86 binaries to introduce diversity with the goal of increasing security against most return-oriented programming attacks.
Command line options:
-noop-insertion // Enable noop insertion.
-noop-insertion-percentage=X // X% of assembly instructions will have a noop prepended (default: 50%, requires -noop-insertion)
-max-noops-per-instruction=X // Randomly generate X noops per instruction. ie. roll the dice X times with probability set above (default: 1). This doesn't guarantee X noop instructions.
In addition, the following 'quick switch' in clang enables basic diversity using default settings (currently: noop insertion and schedule randomization; it is intended to be extended in the future).
-fdiversify
This is the clang part of the patch.
llvm part: D3392
http://reviews.llvm.org/D3393
Patch by Stephen Crane (@rinon)
llvm-svn: 225910
Introduce the following -fsanitize-recover flags:
- -fsanitize-recover=<list>: Enable recovery for selected checks or
group of checks. It is forbidden to explicitly list unrecoverable
sanitizers here (that is, "address", "unreachable", "return").
- -fno-sanitize-recover=<list>: Disable recovery for selected checks or
group of checks.
- -f(no-)?sanitize-recover is now a synonym for
-f(no-)?sanitize-recover=undefined,integer and will soon be deprecated.
These flags are parsed left to right, and mask of "recoverable"
sanitizer is updated accordingly, much like what we do for -fsanitize= flags.
-fsanitize= and -fsanitize-recover= flag families are independent.
CodeGen change: If there is a single UBSan handler function, responsible
for implementing multiple checks, which have different recoverable setting,
then we emit two handler calls instead of one:
the first one for the set of "unrecoverable" checks, another one - for
set of "recoverable" checks. If all checks implemented by a handler have the
same recoverability setting, then the generated code will be the same.
llvm-svn: 225719
Allow blessed access to the symbol rewriter from the driver. Although the
symbol rewriter could be invoked through tools like opt and llc, it would not
accessible from the frontend. This allows us to read the rewrite map files in
the frontend rather than the backend and enable symbol rewriting for actually
performing the symbol interpositioning.
llvm-svn: 225504
If there are some non-ascii character in the input source code, the
column index might be smallar than the byte index. This will result
in two possible assertion failures. This CL fixes the computation of
the column index and byte index.
1. The assertion in startOfNextColumn() and startOfPreviousColumn()
should not be raised when the byte index is greater than the column
index since the non-ascii characters may use more than one bytes to
store a character in a column.
2. The length of the caret line should be equal to the number of columns
of source line, instead of the length of the source line. Otherwise,
the assertion in selectInterestingSourceRegion will be raised because
the removed columns plus the kept columns are not greater than the max
column, which means that we should not remove any column at all.
llvm-svn: 225442
getMainExecutable() returns a std::string, assigning its result
to StringRef immediately creates a dangling pointer. This was
detected by half-broken fast-MSan-bootstrap bot.
llvm-svn: 224956
a CLANG_LIBDIR_SUFFIX down from the build system and using that as part
of the default resource dir computation.
Without this, essentially nothing that uses the clang driver works when
building clang with a libdir suffix. This is probably the single biggest
missing piece of support for multilib as without this people could hack
clang to end up installed in the correct location, but it would then
fail to find its own basic resources. I know of at least one distro that
has some variation on this patch to hack around this; hopefully they'll
be able to use the libdir suffix functionality directly as the rest of
these bits land.
This required fixing a copy of the code to compute Clang's resource
directory that is buried inside of the frontend (!!!). It had bitrotted
significantly relative to the driver code. I've made it essentially
a clone of the driver code in order to keep tests (which use cc1
heavily) passing. This copy should probably just be removed and the
frontend taught to always rely on an explicit resource directory from
the driver, but that is a much more invasive change for another day.
I've also updated one test which actually encoded the resource directory
in its checked output to tolerate multilib suffixes.
Note that this relies on a prior LLVM commit to add a stub to the
autoconf build system for this variable.
llvm-svn: 224924
-trigraphs is now an alias for -ftrigraphs. -fno-trigraphs makes it possible
to explicitly disable trigraphs, which couldn't be done before.
clang -std=c++11 -fno-trigraphs
now builds without GNU extensions, but with trigraphs disabled. Previously,
trigraphs were only disabled in GNU modes or with -std=c++1z.
Make the new -f flags the cc1 interface too. This requires changing -trigraphs
to -ftrigraphs in a few cc1 tests.
Related to PR21974.
llvm-svn: 224790
The default value of Opts.Trigraphs now no longer depends solely on the
language input kind, so move the code out of setLangDefaults(). Also make
sure that Opts.MSVCCompat is set before the Trigraph code runs.
Related to PR21974.
llvm-svn: 224719
Remove Sema::UnqualifiedTyposCorrected, a cache of corrected typos. It would only cache typo corrections that didn't provide ValidateCandidate of which there were few left, and it had a bug when we had the same identifier spelled wrong twice. See the last two tests in typo-correction.cpp for cases this fires.
llvm-svn: 224375
Bitfield RefersToEnclosingLocal of Stmt::DeclRefExprBitfields renamed to RefersToCapturedVariable to reflect latest changes introduced in commit 224323. Also renamed method Expr::refersToEnclosingLocal() to Expr::refersToCapturedVariable() and comments for constant arguments.
No functional changes.
llvm-svn: 224329
Currently, if global variable is marked as a private OpenMP variable, the compiler crashes in debug version or generates incorrect code in release version. It happens because in the OpenMP region the original global variable is used instead of the generated private copy. It happens because currently globals variables are not captured in the OpenMP region.
This patch adds capturing of global variables iff private copy of the global variable must be used in the OpenMP region.
Differential Revision: http://reviews.llvm.org/D6259
llvm-svn: 224323
Mixed path separators (ie, both / and \\) can mess up the sort order
of the VFS map when dumping module dependencies, as was recently
exposed by r224055 and papered over in r224145. Instead, we should
simply use native paths for consistency.
This also adds a TODO to add handling of .. in paths. There was some
code for this before r224055, but it was untested and probably broken.
llvm-svn: 224164
components. These sometimes get synthetically added, and we don't want -Ifoo
and -I./foo to be treated fundamentally differently here.
llvm-svn: 224055
arithmetic relaxation flags:
-cl-no-signed-zeros
-cl-unsafe-math-optimizations
-cl-finite-math-only
-cl-fast-relaxed-math
Propagate the info to FP instruction flags as well
as function attributes where they are available.
llvm-svn: 223928
This is a temporary workaround while MIPS64 has not yet fully supported
128-bit integers. But declaration of int128 type is necessary even though
`__SIZEOF_INT128__` is undefined because c++ standard header files like
`limits` throw error message if `__int128` is not available.
Patch by Sagar Thakur.
Differential Revision: http://reviews.llvm.org/D6402
llvm-svn: 223927
Original commit message:
[modules] Add experimental -fmodule-map-file-home-is-cwd flag to -cc1.
For files named by -fmodule-map-file=, and files found by 'extern module'
directives, this flag specifies that we should resolve filenames relative to
the current working directory rather than relative to the directory in which
the module map file resides. This is aimed at fixing path handling, in
particular for relative -I paths, when building modules that represent
components of the current project (rather than libraries installed on the
current system, which the current project has as dependencies, where we'd
typically expect the module map files to be looked up implicitly).
llvm-svn: 223913
For files named by -fmodule-map-file=, and files found by 'extern module'
directives, this flag specifies that we should resolve filenames relative to
the current working directory rather than relative to the directory in which
the module map file resides. This is aimed at fixing path handling, in
particular for relative -I paths, when building modules that represent
components of the current project (rather than libraries installed on the
current system, which the current project has as dependencies, where we'd
typically expect the module map files to be looked up implicitly).
llvm-svn: 223753
Summary:
Allow CUDA host device functions with two code paths using __CUDA_ARCH__
to differentiate between code path being compiled.
For example:
__host__ __device__ void host_device_function(void) {
#ifdef __CUDA_ARCH__
device_only_function();
#else
host_only_function();
#endif
}
Patch by Jacques Pienaar.
Reviewed By: rnk
Differential Revision: http://reviews.llvm.org/D6457
llvm-svn: 223271
rather than trying to extract this information from the FileEntry after the
fact.
This has a number of beneficial effects. For instance, diagnostic messages for
failed module builds give a path relative to the "module root" rather than an
absolute file path, and the contents of the module includes file is no longer
dependent on what files the including TU happened to inspect prior to
triggering the module build.
llvm-svn: 223095
Summary:
Make DiagnosticsEngine::takeClient return std::unique_ptr<>. Updated
callers to store conditional ownership using a pair of pointer and unique_ptr
instead of a pointer + bool. Updated code that temporarily registers clients to
use the non-owning registration (+ removed extra calls to takeClient).
Reviewers: dblaikie
Reviewed By: dblaikie
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D6294
llvm-svn: 222193
This option was misleading because it looked like it enabled the
language feature of SEH (__try / __except), when this option was really
controlling which EH personality function to use. Mingw only supports
SEH and SjLj EH on x86_64, so we can simply do away with this flag.
llvm-svn: 221963
This fixes an assertion when running clang-tidy on a file having
--serialize-diagnostics in compiler options. Committing a regression test
for clang-tidy separately.
Patch by Aaron Wishnick!
llvm-svn: 221884
Summary:
This change makes the asan-coverge (formerly -mllvm -asan-coverge)
accessible via a clang flag.
Companion patch to LLVM is http://reviews.llvm.org/D6152
Test Plan: regression tests, chromium
Reviewers: samsonov
Reviewed By: samsonov
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D6153
llvm-svn: 221719
For all threadprivate variables which have constructor/destructor emit call to void __kmpc_threadprivate_register(ident_t * <Current Location>, void *<Original Global Addr>, kmpc_ctor <Constructor>, kmpc_cctor NULL, kmpc_dtor <Destructor>);
In expressions all references to such variables are replaced by calls to void *__kmpc_threadprivate_cached(ident_t *<Current Location>, kmp_int32 <Current Thread Id>, void *<Original Global Addr>, size_t <Size of Data>, void ***<Pointer to autogenerated cache – array of private copies of threadprivate variable>);
Test test/OpenMP/threadprivate_codegen.cpp checks that codegen is correct. Also it checks that codegen is correct after serialization/deserialization and one of passes verifies debug info.
Differential Revision: http://reviews.llvm.org/D4002
llvm-svn: 221663
Get rid of ugly SanitizerOptions class thrust into LangOptions:
* Make SanitizeAddressFieldPadding a regular language option,
and rely on default behavior to initialize/reset it.
* Make SanitizerBlacklistFile a regular member LangOptions.
* Introduce the helper class "SanitizerSet" to represent the
set of enabled sanitizers and make it a member of LangOptions.
It is exactly the entity we want to cache and modify in CodeGenFunction,
for instance. We'd also be able to reuse SanitizerSet in
CodeGenOptions for storing the set of recoverable sanitizers,
and in the Driver to represent the set of sanitizers
turned on/off by the commandline flags.
No functionality change.
llvm-svn: 221653
Use the bitmask to store the set of enabled sanitizers instead of a
bitfield. On the negative side, it makes syntax for querying the
set of enabled sanitizers a bit more clunky. On the positive side, we
will be able to use SanitizerKind to eventually implement the
new semantics for -fsanitize-recover= flag, that would allow us
to make some sanitizers recoverable, and some non-recoverable.
No functionality change.
llvm-svn: 221558
Currently, when --serialize-diagnostics is passed this only includes
the diagnostics from clang -cc1, and driver diagnostics are
dropped. This causes issues for tools that use the serialized
diagnostics, since stderr is lost and these diagnostics aren't seen at
all.
We handle this by merging the diagnostics from the CC1 process and the
driver diagnostics into a single file when the driver invokes CC1.
Fixes rdar://problem/10585062
llvm-svn: 220525
Summary:
When using a profile, we used to require the use -gmlt so that we could
get access to the line locations. This is used to match line numbers in
the input profile to the line numbers in the function's IR.
But this is actually not necessary. The driver can provide source
location tracking without the emission of debug information. In these
cases, the annotation 'llvm.dbg.cu' is missing from the IR, but the
actual line location annotations are still present.
This patch tells the driver to only emit source location tracking
when -fprofile-sample-use is present in the command line.
Reviewers: echristo, dblaikie
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D5888
llvm-svn: 220383
Implicit module builds are not well-suited to a lot of build systems. In
particular, they fare badly in distributed build systems, and they lead to
build artifacts that are not tracked as part of the usual dependency management
process. This change allows explicitly-built module files (which are already
supported through the -emit-module flag) to be explicitly loaded into a build,
allowing build systems to opt to manage module builds and dependencies
themselves.
This is only the first step in supporting such configurations, and it should
be considered experimental and subject to change or removal for now.
llvm-svn: 220359
This is long-since overdue, and matches GCC 5.0. This should also be
backwards-compatible, because we already supported all of C11 as an extension
in C99 mode.
llvm-svn: 220244
#include_next interacts poorly with modules: it depends on where in the list of
include paths the current file was found. Files covered by module maps are not
found in include search paths when building the module (and are not found in
include search paths when @importing the module either), so this isn't really
meaningful. Instead, we fake up the result that #include_next *should* have
given: find the first path that would have resulted in the given file being
picked, and search from there onwards.
llvm-svn: 220177
After http://reviews.llvm.org/D5687 is submitted, we will need
SanitizerBlacklist before the CodeGen phase, so make it a LangOpt
(as it will actually affect ABI / class layout).
llvm-svn: 219842
The various ways to create an ASTUnit all take a refcounted pointer to
a diagnostics engine as an argument, and if it isn't pointing at
anything they initialize it. This is a pretty confusing API, and it
really makes more sense for the caller to initialize the thing since
they control the lifetime anyway.
This fixes the one caller that didn't bother initializing the pointer
and asserts that the argument is initialized.
llvm-svn: 219752
I'd mispelled "Bitcode/BitCodes.h" before, and tested on a case
insensitive filesystem.
This reverts commit r219649, effectively re-applying r219647 and
r219648.
llvm-svn: 219664
In cases of nested module builds, or when you care how long module builds take,
this information was not previously easily available / obvious.
llvm-svn: 219658
The bots can't seem to find an include file. Reverting for now and
I'll look into it in a bit.
This reverts commits r219647 and r219648.
llvm-svn: 219649
We currently read serialized diagnostics directly in the C API, which
makes it difficult to reuse this logic elsewhere. This extracts the
core of the serialized diagnostic parsing logic into a base class that
can be subclassed using a visitor pattern.
llvm-svn: 219647
Summary:
This change adds an experimental flag -fsanitize-address-field-padding=N (0, 1, 2)
to clang and driver. With this flag ASAN will be able to detect some cases of
intra-object-overflow bugs,
see https://code.google.com/p/address-sanitizer/wiki/IntraObjectOverflow
There is no actual functionality here yet, just the flag parsing.
The functionality is being reviewed at http://reviews.llvm.org/D5687
Test Plan: Build and run SPEC, LLVM Bootstrap, Chrome with this flag.
Reviewers: samsonov
Reviewed By: samsonov
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D5676
llvm-svn: 219417
There's probably never a good reason to iterate over unique_ptrs. This
lets us use range-for and say Job.foo instead of (*it)->foo in a few
places.
llvm-svn: 218938
Otherwise we can end up silently skipping an import. If we happen to be
building another module at the time, we may build a mysteriously broken
module and not know why it seems to be missing symbols.
llvm-svn: 218552
This is another case of conditional ownership (in this case a raw
reference, plus a boolean to indicate whether the referenced object
should be deleted). While it's not ideal, I prefer to make the ownership
explicit with a unique_ptr than using a boolean flag (though it does
make the reference and the unique_ptr redundant in the sense that they
both refer to the same memory). At some point we might write a reusable
conditional ownership pointer (a stateful custom deleter for a unique_ptr
may be appropriate).
Based on a patch from a patch by Anton Yartsev.
llvm-svn: 217791
This adds a flag called -fseh-exceptions that uses the native Windows
.pdata and .xdata unwind mechanism to throw exceptions. The other EH
possibilities are DWARF and SJLJ exceptions.
Patch by Martell Malone!
Reviewed By: asl, rnk
Differential Revision: http://reviews.llvm.org/D3419
llvm-svn: 217790
1. We were hitting the NextIsPrevious assertion because we were trying
to merge decl chains that were independent of each other because we had
no Sema object to allow them to find existing decls. This is fixed by
delaying loading the "preloaded" decls until Sema is available.
2. We were trying to get identifier info from an annotation token, which
asserts. The fix is to special-case the module annotations in the
preprocessed output printer.
Fixed in a single commit because when you hit 1 you almost invariably
hit 2 as well.
llvm-svn: 217550
It is very common to include headers with DOS-style line endings, such
as windows.h, from source files with Unix-style line endings.
Previously, we would end up with mixed line endings and #endifs that
appeared to be on the same line:
#if 0 /* expanded by -frewrite-includes */
#include <windows.h>^M#endif /* expanded by -frewrite-includes */
Clang treats either of \r or \n as a line ending character, so this is
purely a cosmetic issue.
This has no automated test because most Unix tools on Windows will
implictly convert CRLF to LF when reading files, making it very hard to
detect line ending mismatches. FileCheck doesn't understand {{\r}}
either.
Fixes PR20552.
llvm-svn: 217259
People have been incorrectly using "-analyzer-disable-checker" to
silence analyzer warnings on a file, when analyzing a project. Add
the "-analyzer-disable-all-checks" option, which would allow the
suppression and suggest it as part of the error message for
"-analyzer-disable-checker". The idea here is to compose this with
"--analyze" so that users can selectively opt out specific files from
static analysis.
llvm-svn: 216763
In theory, it'd be nice if we could move to a case where all buried
pointers were buried via unique_ptr to demonstrate that the program had
finished with the value (that we could really have cleanly deallocated
it) but instead chose to bury it.
I think the main reason that's not possible right now is the various
IntrusiveRefCntPtrs in the Frontend, sharing ownership for a variety of
compiler bits (see the various similar
"CompilerInstance::releaseAndLeak*" functions). I have yet to figure out
their correct ownership semantics - but perhaps, even if the
intrusiveness can be removed, the shared ownership may yet remain and
that would lead to a non-unique burying as is there today. (though we
could model that a little better - by passing in a shared_ptr, etc -
rather than needing the two step that's currently used in those other
releaseAndLeak* functions)
This might be a bit more robust if BuryPointer took the boolean:
BuryPointer(bool, unique_ptr<T>)
and the choice to bury was made internally - that way, even when
DisableFree was not set, the unique_ptr would still be null in the
caller and there'd be no chance of accidentally having a different
codepath where the value is used after burial in !DisableFree, but it
becomes null only in DisableFree, etc...
llvm-svn: 216742
Rather than having a pair of pairs and a reference out parameter, build
a structure with everything together and named. A raw pointer and a
unique_ptr, rather than a raw pointer and a boolean, are used to
communicate ownership transfer.
It's possible one day we'll end up with a conditional pointer (probably
represented by a raw pointer and a boolean) abstraction to use in places
like this. Conditional ownership seems to be coming up more often than
I'd hoped...
llvm-svn: 216712
This change is the last in the pack of five commits
(also see r216691, r216694, r216695, and r216696) that reduces the number
of test failures in "check-clang" invocation in UBSan bootstrap
from 2443 down to 5.
llvm-svn: 216697
Only those callers who are dynamically passing ownership should need the
3 argument form. Those accepting the default ("do pass ownership")
should do so explicitly with a unique_ptr now.
llvm-svn: 216614
ACLE 2.0 allows __fp16 to be used as a function argument or return
type. This enables this for AArch64.
This also fixes an existing bug that causes clang to not allow
homogeneous floating-point aggregates with a base type of __fp16. This
is valid for AAPCS64, but not for AAPCS-VFP.
llvm-svn: 216558
Currently the analyzer lazily models some functions using 'BodyFarm',
which constructs a fake function implementation that the analyzer
can simulate that approximates the semantics of the function when
it is called. BodyFarm does this by constructing the AST for
such definitions on-the-fly. One strength of BodyFarm
is that all symbols and types referenced by synthesized function
bodies are contextual adapted to the containing translation unit.
The downside is that these ASTs are hardcoded in Clang's own
source code.
A more scalable model is to allow these models to be defined as source
code in separate "model" files and have the analyzer use those
definitions lazily when a function body is needed. Among other things,
it will allow more customization of the analyzer for specific APIs
and platforms.
This patch provides the initial infrastructure for this feature.
It extends BodyFarm to use an abstract API 'CodeInjector' that can be
used to synthesize function bodies. That 'CodeInjector' is
implemented using a new 'ModelInjector' in libFrontend, which lazily
parses a model file and injects the ASTs into the current translation
unit.
Models are currently found by specifying a 'model-path' as an
analyzer option; if no path is specified the CodeInjector is not
used, thus defaulting to the current behavior in the analyzer.
Models currently contain a single function definition, and can
be found by finding the file <function name>.model. This is an
initial starting point for something more rich, but it bootstraps
this feature for future evolution.
This patch was contributed by Gábor Horváth as part of his
Google Summer of Code project.
Some notes:
- This introduces the notion of a "model file" into
FrontendAction and the Preprocessor. This nomenclature
is specific to the static analyzer, but possibly could be
generalized. Essentially these are sources pulled in
exogenously from the principal translation.
Preprocessor gets a 'InitializeForModelFile' and
'FinalizeForModelFile' which could possibly be hoisted out
of Preprocessor if Preprocessor exposed a new API to
change the PragmaHandlers and some other internal pieces. This
can be revisited.
FrontendAction gets a 'isModelParsingAction()' predicate function
used to allow a new FrontendAction to recycle the Preprocessor
and ASTContext. This name could probably be made something
more general (i.e., not tied to 'model files') at the expense
of losing the intent of why it exists. This can be revisited.
- This is a moderate sized patch; it has gone through some amount of
offline code review. Most of the changes to the non-analyzer
parts are fairly small, and would make little sense without
the analyzer changes.
- Most of the analyzer changes are plumbing, with the interesting
behavior being introduced by ModelInjector.cpp and
ModelConsumer.cpp.
- The new functionality introduced by this change is off-by-default.
It requires an analyzer config option to enable.
llvm-svn: 216550