Since CallExpr::setNumArgs has been removed, it is now possible to store the
callee expression and the argument expressions of CallExpr in a trailing array.
This saves one pointer per CallExpr, CXXOperatorCallExpr, CXXMemberCallExpr,
CUDAKernelCallExpr and UserDefinedLiteral.
Given that CallExpr is used as a base of the above classes we cannot use
llvm::TrailingObjects. Instead we store the offset in bytes from the this pointer
to the start of the trailing objects and manually do the casts + arithmetic.
Some notes:
1.) I did not try to fit the number of arguments in the bit-fields of Stmt.
This leaves some space for future additions and avoid the discussion about
whether x bits are sufficient to hold the number of arguments.
2.) It would be perfectly possible to recompute the offset to the trailing
objects before accessing the trailing objects. However the trailing objects
are frequently accessed and benchmarks show that it is slightly faster to
just load the offset from the bit-fields. Additionally, because of 1),
we have plenty of space in the bit-fields of Stmt.
Differential Revision: https://reviews.llvm.org/D55771
Reviewed By: rjmccall
llvm-svn: 349910
All of the other constructors already take a reference to the AST context.
This avoids calling Decl::getASTContext in most cases. Additionally move
the definition of the constructor from Expr.h to Expr.cpp since it is calling
DeclRefExpr::computeDependence. NFC.
llvm-svn: 349901
Summary:
Add an option to initialize automatic variables with either a pattern or with
zeroes. The default is still that automatic variables are uninitialized. Also
add attributes to request uninitialized on a per-variable basis, mainly to disable
initialization of large stack arrays when deemed too expensive.
This isn't meant to change the semantics of C and C++. Rather, it's meant to be
a last-resort when programmers inadvertently have some undefined behavior in
their code. This patch aims to make undefined behavior hurt less, which
security-minded people will be very happy about. Notably, this means that
there's no inadvertent information leak when:
- The compiler re-uses stack slots, and a value is used uninitialized.
- The compiler re-uses a register, and a value is used uninitialized.
- Stack structs / arrays / unions with padding are copied.
This patch only addresses stack and register information leaks. There's many
more infoleaks that we could address, and much more undefined behavior that
could be tamed. Let's keep this patch focused, and I'm happy to address related
issues elsewhere.
To keep the patch simple, only some `undef` is removed for now, see
`replaceUndef`. The padding-related infoleaks are therefore not all gone yet.
This will be addressed in a follow-up, mainly because addressing padding-related
leaks should be a stand-alone option which is implied by variable
initialization.
There are three options when it comes to automatic variable initialization:
0. Uninitialized
This is C and C++'s default. It's not changing. Depending on code
generation, a programmer who runs into undefined behavior by using an
uninialized automatic variable may observe any previous value (including
program secrets), or any value which the compiler saw fit to materialize on
the stack or in a register (this could be to synthesize an immediate, to
refer to code or data locations, to generate cookies, etc).
1. Pattern initialization
This is the recommended initialization approach. Pattern initialization's
goal is to initialize automatic variables with values which will likely
transform logic bugs into crashes down the line, are easily recognizable in
a crash dump, without being values which programmers can rely on for useful
program semantics. At the same time, pattern initialization tries to
generate code which will optimize well. You'll find the following details in
`patternFor`:
- Integers are initialized with repeated 0xAA bytes (infinite scream).
- Vectors of integers are also initialized with infinite scream.
- Pointers are initialized with infinite scream on 64-bit platforms because
it's an unmappable pointer value on architectures I'm aware of. Pointers
are initialize to 0x000000AA (small scream) on 32-bit platforms because
32-bit platforms don't consistently offer unmappable pages. When they do
it's usually the zero page. As people try this out, I expect that we'll
want to allow different platforms to customize this, let's do so later.
- Vectors of pointers are initialized the same way pointers are.
- Floating point values and vectors are initialized with a negative quiet
NaN with repeated 0xFF payload (e.g. 0xffffffff and 0xffffffffffffffff).
NaNs are nice (here, anways) because they propagate on arithmetic, making
it more likely that entire computations become NaN when a single
uninitialized value sneaks in.
- Arrays are initialized to their homogeneous elements' initialization
value, repeated. Stack-based Variable-Length Arrays (VLAs) are
runtime-initialized to the allocated size (no effort is made for negative
size, but zero-sized VLAs are untouched even if technically undefined).
- Structs are initialized to their heterogeneous element's initialization
values. Zero-size structs are initialized as 0xAA since they're allocated
a single byte.
- Unions are initialized using the initialization for the largest member of
the union.
Expect the values used for pattern initialization to change over time, as we
refine heuristics (both for performance and security). The goal is truly to
avoid injecting semantics into undefined behavior, and we should be
comfortable changing these values when there's a worthwhile point in doing
so.
Why so much infinite scream? Repeated byte patterns tend to be easy to
synthesize on most architectures, and otherwise memset is usually very
efficient. For values which aren't entirely repeated byte patterns, LLVM
will often generate code which does memset + a few stores.
2. Zero initialization
Zero initialize all values. This has the unfortunate side-effect of
providing semantics to otherwise undefined behavior, programs therefore
might start to rely on this behavior, and that's sad. However, some
programmers believe that pattern initialization is too expensive for them,
and data might show that they're right. The only way to make these
programmers wrong is to offer zero-initialization as an option, figure out
where they are right, and optimize the compiler into submission. Until the
compiler provides acceptable performance for all security-minded code, zero
initialization is a useful (if blunt) tool.
I've been asked for a fourth initialization option: user-provided byte value.
This might be useful, and can easily be added later.
Why is an out-of band initialization mecanism desired? We could instead use
-Wuninitialized! Indeed we could, but then we're forcing the programmer to
provide semantics for something which doesn't actually have any (it's
uninitialized!). It's then unclear whether `int derp = 0;` lends meaning to `0`,
or whether it's just there to shut that warning up. It's also way easier to use
a compiler flag than it is to manually and intelligently initialize all values
in a program.
Why not just rely on static analysis? Because it cannot reason about all dynamic
code paths effectively, and it has false positives. It's a great tool, could get
even better, but it's simply incapable of catching all uses of uninitialized
values.
Why not just rely on memory sanitizer? Because it's not universally available,
has a 3x performance cost, and shouldn't be deployed in production. Again, it's
a great tool, it'll find the dynamic uses of uninitialized variables that your
test coverage hits, but it won't find the ones that you encounter in production.
What's the performance like? Not too bad! Previous publications [0] have cited
2.7 to 4.5% averages. We've commmitted a few patches over the last few months to
address specific regressions, both in code size and performance. In all cases,
the optimizations are generally useful, but variable initialization benefits
from them a lot more than regular code does. We've got a handful of other
optimizations in mind, but the code is in good enough shape and has found enough
latent issues that it's a good time to get the change reviewed, checked in, and
have others kick the tires. We'll continue reducing overheads as we try this out
on diverse codebases.
Is it a good idea? Security-minded folks think so, and apparently so does the
Microsoft Visual Studio team [1] who say "Between 2017 and mid 2018, this
feature would have killed 49 MSRC cases that involved uninitialized struct data
leaking across a trust boundary. It would have also mitigated a number of bugs
involving uninitialized struct data being used directly.". They seem to use pure
zero initialization, and claim to have taken the overheads down to within noise.
Don't just trust Microsoft though, here's another relevant person asking for
this [2]. It's been proposed for GCC [3] and LLVM [4] before.
What are the caveats? A few!
- Variables declared in unreachable code, and used later, aren't initialized.
This goto, Duff's device, other objectionable uses of switch. This should
instead be a hard-error in any serious codebase.
- Volatile stack variables are still weird. That's pre-existing, it's really
the language's fault and this patch keeps it weird. We should deprecate
volatile [5].
- As noted above, padding isn't fully handled yet.
I don't think these caveats make the patch untenable because they can be
addressed separately.
Should this be on by default? Maybe, in some circumstances. It's a conversation
we can have when we've tried it out sufficiently, and we're confident that we've
eliminated enough of the overheads that most codebases would want to opt-in.
Let's keep our precious undefined behavior until that point in time.
How do I use it:
1. On the command-line:
-ftrivial-auto-var-init=uninitialized (the default)
-ftrivial-auto-var-init=pattern
-ftrivial-auto-var-init=zero -enable-trivial-auto-var-init-zero-knowing-it-will-be-removed-from-clang
2. Using an attribute:
int dont_initialize_me __attribute((uninitialized));
[0]: https://users.elis.ugent.be/~jsartor/researchDocs/OOPSLA2011Zero-submit.pdf
[1]: https://twitter.com/JosephBialek/status/1062774315098112001
[2]: https://outflux.net/slides/2018/lss/danger.pdf
[3]: https://gcc.gnu.org/ml/gcc-patches/2014-06/msg00615.html
[4]: 776a0955ef
[5]: http://wg21.link/p1152
I've also posted an RFC to cfe-dev: http://lists.llvm.org/pipermail/cfe-dev/2018-November/060172.html
<rdar://problem/39131435>
Reviewers: pcc, kcc, rsmith
Subscribers: JDevlieghere, jkorous, dexonsmith, cfe-commits
Differential Revision: https://reviews.llvm.org/D54604
llvm-svn: 349442
pass in the -target-sdk-version to the compiler and backend
This commit adds support for reading the SDKSettings.json file in the Darwin
driver. This file is used by the driver to determine the SDK's version, and it
uses that information to pass it down to the compiler using the new
-target-sdk-version= option. This option is then used to set the appropriate
SDK Version module metadata introduced in r349119.
Note: I had to adjust the two ast tests as the SDKROOT environment variable
on macOS caused SDK version to be picked up for the compilation of source file
but not the AST.
rdar://45774000
Differential Revision: https://reviews.llvm.org/D55673
llvm-svn: 349380
Summary:
There are certain cases when normal C/C++ lookup (localUncachedLookup)
does not find AST nodes. E.g.:
Example 1:
template <class T>
struct X {
friend void foo(); // this is never found in the DC of the TU.
};
Example 2:
// The fwd decl to Foo is not found in the lookupPtr of the DC of the
// translation unit decl.
struct A { struct Foo *p; };
In these cases we create a new node instead of returning with the old one.
To fix it we create a new lookup table which holds every node and we are
not interested in any C++ specific visibility considerations.
Simply, we must know if there is an existing Decl in a given DC.
Reviewers: a_sidorin, a.sidorin
Subscribers: mgorny, rnkovacs, dkrupp, Szelethus, cfe-commits
Differential Revision: https://reviews.llvm.org/D53708
llvm-svn: 349351
Implement options in clang to enable recording the driver command-line
in an ELF section.
Implement a new special named metadata, llvm.commandline, to support
frontends embedding their command-line options in IR/ASM/ELF.
This differs from the GCC implementation in some key ways:
* In GCC there is only one command-line possible per compilation-unit,
in LLVM it mirrors llvm.ident and multiple are allowed.
* In GCC individual options are separated by NULL bytes, in LLVM entire
command-lines are separated by NULL bytes. The advantage of the GCC
approach is to clearly delineate options in the face of embedded
spaces. The advantage of the LLVM approach is to support merging
multiple command-lines unambiguously, while handling embedded spaces
with escaping.
Differential Revision: https://reviews.llvm.org/D54487
Clang Differential Revision: https://reviews.llvm.org/D54489
llvm-svn: 349155
Move some diagnostics around between Diagnostic*Kinds.td files. Diagnostics
used in multiple places were moved to DiagnosticCommonKinds.td. Diagnostics
listed in the wrong place (ie, Sema diagnostics listed in
DiagnosticsParseKinds.td) were moved to the correct places. One diagnostic
split into two so that the diagnostic string is in the .td file instead of in
code. Cleaned up the diagnostic includes after all the changes.
llvm-svn: 349125
It is faster to directly call the ObjC runtime for methods such as alloc/allocWithZone instead of sending a message to those functions.
This patch adds support for converting messages to alloc/allocWithZone to their equivalent runtime calls.
Tests included for the positive case of applying this transformation, negative tests that we ensure we only convert "alloc" to objc_alloc, not "alloc2", and also a driver test to ensure we enable this only for supported runtime versions.
Reviewed By: rjmccall
https://reviews.llvm.org/D55349
llvm-svn: 348687
Summary:
The intention is to make the tools replaying compilations from 'compile_commands.json'
(clang-tidy, clangd, etc.) find the same standard library as the original compiler
specified in 'compile_commands.json'.
Previously, the library detection logic was in the frontend (InitHeaderSearch.cpp) and relied
on the value of resource dir as an approximation of the compiler install dir. The new logic
uses the actual compiler install dir and is performed in the driver. This is consistent with
the C++ standard library detection on other platforms and allows to override the resource dir
in the tools using the compile_commands.json without altering the
standard library detection mechanism. The tools have to override the resource dir to make sure
they use a consistent version of the builtin headers.
There is still logic in InitHeaderSearch that attemps to add the absolute includes for the
the C++ standard library, so we keep passing the -stdlib=libc++ from the driver to the frontend
via cc1 args to avoid breaking that. In the long run, we should move this logic to the driver too,
but it could potentially break the library detection on other systems, so we don't tackle it in this
patch to keep its scope manageable.
This is a second attempt to fix the issue, first one was commited in r346652 and reverted in r346675.
The original fix relied on an ad-hoc propagation (bypassing the cc1 flags) of the install dir from the
driver to the frontend's HeaderSearchOptions. Unsurpisingly, the propagation was incomplete, it broke
the libc++ detection in clang itself, which caused LLDB tests to break.
The LLDB tests pass with new fix.
Reviewers: JDevlieghere, arphaman, EricWF
Reviewed By: arphaman
Subscribers: mclow.lists, ldionne, dexonsmith, ioeric, christof, kadircet, cfe-commits
Differential Revision: https://reviews.llvm.org/D54630
llvm-svn: 348365
When debugging a boost build with a modified
version of Clang, I discovered that the PTH implementation
stores TokenKind in 8 bits. However, we currently have 368
TokenKinds.
The result is that the value gets truncated and the wrong token
gets picked up when including PTH files. It seems that this will
go wrong every time someone uses a token that uses the 9th bit.
Upon asking on IRC, it was brought up that this was a highly
experimental features that was considered a failure. I discovered
via googling that BoostBuild (mostly Boost.Math) is the only user of
this
feature, using the CC1 flag directly. I believe that this can be
transferred over to normal PCH with minimal effort:
https://github.com/boostorg/build/issues/367
Based on advice on IRC and research showing that this is a nearly
completely unused feature, this patch removes it entirely.
Note: I considered leaving the build-flags in place and making them
emit an error/warning, however since I've basically identified and
warned the only user, it seemed better to just remove them.
Differential Revision: https://reviews.llvm.org/D54547
Change-Id: If32744275ef1f585357bd6c1c813d96973c4d8d9
llvm-svn: 348266
When the global new and delete operators aren't declared, Clang
provides and implicit declaration, but this declaration currently
always uses the default visibility. This is a problem when the
C++ library itself is being built with non-default visibility because
the implicit declaration will force the new and delete operators to
have the default visibility unlike the rest of the library.
The existing workaround is to use assembly to enforce the visiblity:
https://fuchsia.googlesource.com/zircon/+/master/system/ulib/zxcpp/new.cpp#108
but that solution is not always available, e.g. in the case of of
libFuzzer which is using an internal version of libc++ that's also built
with -fvisibility=hidden where the existing behavior is causing issues.
This change introduces a new option -fvisibility-global-new-delete-hidden
which makes the implicit declaration of the global new and delete
operators hidden.
Differential Revision: https://reviews.llvm.org/D53787
llvm-svn: 348234
In earlier patches regarding AnalyzerOptions, a lot of effort went into
gathering all config options, and changing the interface so that potential
misuse can be eliminited.
Up until this point, AnalyzerOptions only evaluated an option when it was
querried. For example, if we had a "-no-false-positives" flag, AnalyzerOptions
would store an Optional field for it that would be None up until somewhere in
the code until the flag's getter function is called.
However, now that we're confident that we've gathered all configs, we can
evaluate off of them before analysis, so we can emit a error on invalid input
even if that prticular flag will not matter in that particular run of the
analyzer. Another very big benefit of this is that debug.ConfigDumper will now
show the value of all configs every single time.
Also, almost all options related class have a similar interface, so uniformity
is also a benefit.
The implementation for errors on invalid input will be commited shorty.
Differential Revision: https://reviews.llvm.org/D53692
llvm-svn: 348031
This patch passes -fdebug-prefix-map (a feature for renaming source
paths in the debug info) through to the per-module codegen options and
adds the debug prefix map to the module hash.
<rdar://problem/46045865>
Differential Revision: https://reviews.llvm.org/D55037
llvm-svn: 347926
This adds Hurd toolchain support to Clang's driver in addition
to handling translating the triple from Hurd-compatible form to
the actual triple registered in LLVM.
(Phabricator was stripping the empty files from the patch so I
manually created them)
Patch by sthibaul (Samuel Thibault)
Differential Revision: https://reviews.llvm.org/D54379
llvm-svn: 347833
Summary:
the previous patch (https://reviews.llvm.org/rC346642) has been reverted because of test failure under windows.
So this patch fix the test cfe/trunk/test/CodeGen/code-coverage-filter.c.
Reviewers: marco-c
Reviewed By: marco-c
Subscribers: cfe-commits, sylvestre.ledru
Differential Revision: https://reviews.llvm.org/D54600
llvm-svn: 347144
Summary:
Experience has shown that the functionality is useful. It makes linking
optimized clang with debug info for me a lot faster, 20s to 13s. The
type merging phase of PDB writing goes from 10s to 3s.
This removes the LLVM cl::opt and replaces it with a metadata flag.
After this change, users can do the following to use ghash:
- add -gcodeview-ghash to compiler flags
- replace /DEBUG with /DEBUG:GHASH in linker flags
Reviewers: zturner, hans, thakis, takuto.ikuta
Subscribers: aprantl, hiraditya, JDevlieghere, llvm-commits
Differential Revision: https://reviews.llvm.org/D54370
llvm-svn: 347072
Previously these would be transformed into annotation tokens and the
preprocessor would then assume they were real tokens with source
locations and assert/UB.
Other pragmas that produce annotation tokens aren't a problem because
they aren't handled if the parser isn't hooked up - ParsePragma.cpp
registers those handlers & isn't run for pure preprocessing. So they're
treated as unknown pragmas & printed verbatim by the preprocessor.
Perhaps these pragmas should be treated the same way? But they got mixed
in with other __debug pragmas that do need to be handled during
preprocessing.
The third __debug pragma that produces an annotation token is 'captured'
- which had its own fix for this issue - by not inserting the annotation
token in the first place if it detected that it was in preprocessing
mode. I've removed that fix (from Lex/Pragma.cpp) in favor of the more
general one in Frontend/PrintPreprocessedOutput.cpp.
llvm-svn: 346928
This unfortunately results in a substantial breaking change when
switching to C++20, but it's not yet clear what / how much we should
do about that. We may want to add a compatibility conversion from
u8 string literals to const char*, similar to how C++98 provided a
compatibility conversion from string literals to non-const char*,
but that's not handled by this patch.
The feature can be disabled in C++20 mode with -fno-char8_t.
llvm-svn: 346892
The DWARF5 specification says(Appendix F.1):
"The sections that do not require relocation, however, can be
written to the relocatable object (.o) file but ignored by the
linker or they can be written to a separate DWARF object (.dwo)
file that need not be accessed by the linker."
The first part describes a single file split DWARF feature and there
is no way to trigger this behavior atm.
Fortunately, no many changes are required to keep *.dwo sections
in a .o, the patch does that.
Differential revision: https://reviews.llvm.org/D52296
llvm-svn: 346837
Summary:
This saves a lot of relocations in optimized object files (at the cost
of some cost/increase in linked executable bytes), but gold's 32 bit
gdb-index support has a bug (
https://sourceware.org/bugzilla/show_bug.cgi?id=21894 ) so we can't
switch to this unconditionally. (& even if it weren't for that bug, one
might argue that some users would want to optimize in one direction or
the other - prioritizing object size or linked executable size)
Differential Revision: https://reviews.llvm.org/D54243
llvm-svn: 346789
Summary:
When they read compiler args from compile_commands.json.
This change allows to run clang-based tools, like clang-tidy or clangd,
built from head using the compile_commands.json file produced for XCode
toolchains.
On MacOS clang can find the C++ standard library relative to the
compiler installation dir.
The logic to do this was based on resource dir as an approximation of
where the compiler is installed. This broke the tools that read
'compile_commands.json' and don't ship with the compiler, as they
typically change resource dir.
To workaround this, we now use compiler install dir detected by the driver
to better mimic the behavior of the original compiler when replaying the
compilations using other tools.
Reviewers: sammccall, arphaman, EricWF
Reviewed By: sammccall
Subscribers: ioeric, christof, kadircet, cfe-commits
Differential Revision: https://reviews.llvm.org/D54310
llvm-svn: 346652
Summary:
These options are taking regex separated by colons to filter files.
- if both are empty then all files are instrumented
- if -fprofile-filter-files is empty then all the filenames matching any of the regex from exclude are not instrumented
- if -fprofile-exclude-files is empty then all the filenames matching any of the regex from filter are instrumented
- if both aren't empty then all the filenames which match any of the regex in filter and which don't match all the regex in filter are instrumented
- this patch is a follow-up of https://reviews.llvm.org/D52033
Reviewers: marco-c, vsk
Reviewed By: marco-c, vsk
Subscribers: cfe-commits, sylvestre.ledru
Differential Revision: https://reviews.llvm.org/D52034
llvm-svn: 346642
Fix places where the return type of a FunctionDecl was being used in
place of the function type
FunctionDecl::Create() takes as its T parameter the type of function
that should be created, not the return type. Passing in the return type
looks to have been copypasta'd around a bit, but the number of correct
usages outweighs the incorrect ones so I've opted for keeping what T is
the same and fixing up the call sites instead.
This fixes a crash in Clang when attempting to compile the following
snippet of code with -fblocks -fsanitize=function -x objective-c++ (my
original repro case):
void g(void(^)());
void f()
{
__block int a = 0;
g(^(){ a++; });
}
as well as the following which only requires -fsanitize=function -x c++:
void f(char * buf)
{
__builtin_os_log_format(buf, "");
}
Patch by: Ben (bobsayshilol)
Differential revision: https://reviews.llvm.org/D53263
llvm-svn: 346601
The current version only emits the below error for a module (attempted to be loaded) from the `prebuilt-module-path`:
```
error: module file blabla.pcm cannot be loaded due to a configuration mismatch with the current compilation [-Wmodule-file-config-mismatch]
```
With this change, if the prebuilt module is used, we allow the proper diagnostic behind the configuration mismatch to be shown.
```
error: POSIX thread support was disabled in PCH file but is currently enabled
error: module file blabla.pcm cannot be loaded due to a configuration mismatch with the current compilation [-Wmodule-file-config-mismatch]
```
(A few lines later an error is emitted anyways, so there is no reason not to complain for configuration mismatches if a config mismatch is found and kills the build.)
Reviewed By: dblaikie
Tags: #clang
Differential Revision: https://reviews.llvm.org/D53334
llvm-svn: 346439
This reverts commit r345963. We have a path forward now.
Original commit message:
The driver accidentally stopped passing the input filenames on to -cc1
in this mode due to confusion over what action was being requested.
This change also fixes a couple of crashes I encountered when passing
multiple files to such a -cc1 invocation.
llvm-svn: 346130
Summary:
This CL adds /Zc:DllexportInlines flag to clang-cl.
When Zc:DllexportInlines- is specified, inline class member function is not exported if the function does not have local static variables.
By not exporting inline function, code for those functions are not generated and that reduces both compile time and obj size. Also this flag does not import inline functions from dllimported class if the function does not have local static variables.
On my 24C48T windows10 machine, build performance of chrome target in chromium repository is like below.
These stats are come with 'target_cpu="x86" enable_nacl = false is_component_build=true dcheck_always_on=true` build config and applied
* https://chromium-review.googlesource.com/c/chromium/src/+/1212379
* https://chromium-review.googlesource.com/c/v8/v8/+/1186017
Below stats were taken with this patch applied on a05115cd4c
| config | build time | speedup | build dir size |
| with patch, PCH on, debug | 1h10m0s | x1.13 | 35.6GB |
| without patch, PCH on, debug | 1h19m17s | | 49.0GB |
| with patch, PCH off, debug | 1h15m45s | x1.16 | 33.7GB |
| without patch, PCH off, debug | 1h28m10s | | 52.3GB |
| with patch, PCH on, release | 1h13m13s | x1.22 | 26.2GB |
| without patch, PCH on, release | 1h29m57s | | 37.5GB |
| with patch, PCH off, release | 1h23m38s | x1.32 | 23.7GB |
| without patch, PCH off, release | 1h50m50s | | 38.7GB |
This patch reduced obj size and the number of exported symbols largely, that improved link time too.
e.g. link time stats of blink_core.dll become like below
| | cold disk cache | warm disk cache |
| with patch, PCH on, debug | 71s | 30s |
| without patch, PCH on, debug | 111s | 48s |
This patch's implementation is based on Nico Weber's patch. I modified to support static local variable, added tests and took stats.
Bug: https://bugs.llvm.org/show_bug.cgi?id=33628
Reviewers: hans, thakis, rnk, javed.absar
Reviewed By: hans
Subscribers: kristof.beyls, smeenai, dschuff, probinson, cfe-commits, eraman
Differential Revision: https://reviews.llvm.org/D51340
llvm-svn: 346069
Handle it in the driver and propagate it to cc1
Reviewers: rjmccall, kcc, rsmith
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D52615
llvm-svn: 346001
target/teams/distribute regions.
Target/teams/distribute regions exist for all the time the kernel is
executed. Thus, if the variable is declared in their context and then
escape it, we can allocate global memory statically instead of
allocating it dynamically.
Patch captures all the globalized variables in target/teams/distribute
contexts, merges them into the records, one per each target region.
Those records are then joined into the union, one per compilation unit
(to save the global memory). Those units are organized into
2 x dimensional arrays, where the first dimension is
the number of blocks per SM and the second one is the number of SMs.
Runtime functions manage this global memory space between the executing
teams.
llvm-svn: 345978
This reverts commit r345803 and r345915 (a follow-up fix to r345803).
Reason: r345803 blocks our internal integrate because of the new
warnings showing up in too many places. The fix is actually correct,
we will reland it after figuring out how to integrate properly.
llvm-svn: 345963
This patch should not introduce any behavior changes. It consists of
mostly one of two changes:
1. Replacing fall through comments with the LLVM_FALLTHROUGH macro
2. Inserting 'break' before falling through into a case block consisting
of only 'break'.
We were already using this warning with GCC, but its warning behaves
slightly differently. In this patch, the following differences are
relevant:
1. GCC recognizes comments that say "fall through" as annotations, clang
doesn't
2. GCC doesn't warn on "case N: foo(); default: break;", clang does
3. GCC doesn't warn when the case contains a switch, but falls through
the outer case.
I will enable the warning separately in a follow-up patch so that it can
be cleanly reverted if necessary.
Reviewers: alexfh, rsmith, lattner, rtrieu, EricWF, bollu
Differential Revision: https://reviews.llvm.org/D53950
llvm-svn: 345882
-fsyntax-only.
The driver accidentally stopped passing the input filenames on to -cc1
in this mode due to confusion over what action was being requested.
This change also fixes a couple of crashes I encountered when passing
multiple files to such a -cc1 invocation.
llvm-svn: 345803
We haven't supported compiling ObjC1 for a long time (and never will again), so
there isn't any reason to keep these separate. This patch replaces
LangOpts::ObjC1 and LangOpts::ObjC2 with LangOpts::ObjC.
Differential revision: https://reviews.llvm.org/D53547
llvm-svn: 345637
Default property value 'true' preserves current behavior. Value 'false' can be
used to create VFS "root", file system that gives better control over which
files compiler can use during compilation as there are no unpredictable
accesses to real file system.
Non-fallthrough use case changes how we treat multiple VFS overlay
files. Instead of all of them being at the same level just above a real
file system, now they are nested and subsequent overlays can refer to
files in previous overlays.
Change is done both in LLVM and Clang, corresponding LLVM commit is r345431.
rdar://problem/39465552
Reviewers: bruno, benlangmuir
Reviewed By: bruno
Subscribers: dexonsmith, cfe-commits, hiraditya
Differential Revision: https://reviews.llvm.org/D50539
llvm-svn: 345432
Summary:
- Add `UETT_PreferredAlignOf` to account for the difference between `__alignof` and `alignof`
- `AlignOfType` now returns ABI alignment instead of preferred alignment iff clang-abi-compat > 7, and one uses _Alignof or alignof
Patch by Nicole Mazzuca!
Differential Revision: https://reviews.llvm.org/D53207
llvm-svn: 345419
Add a new driver level flag `-fcf-runtime-abi=` that allows one to specify the
runtime ABI for CoreFoundation. This controls the language interoperability.
In particular, this is relevant for generating the CFConstantString classes
(primarily through the `__builtin___CFStringMakeConstantString` builtin) which
construct a reference to the "CFObject"'s `isa` field. This type differs
between swift 4.1 and 4.2+.
Valid values for the new option include:
- objc [default behaviour] - enable ObjectiveC interoperability
- swift-4.1 - enable interoperability with swift 4.1
- swift-4.2 - enable interoperability with swift 4.2
- swift-5.0 - enable interoperability with swift 5.0
- swift [alias] - target the latest swift ABI
Furthermore, swift 4.2+ changed the layout for the CFString when building
CoreFoundation *without* ObjectiveC interoperability. In such a case, a field
was added to the CFObject base type changing it from: <{ const int*, int }> to
<{ uintptr_t, uintptr_t, uint64_t }>.
In swift 5.0, the CFString type will be further adjusted to change the length
from a uint32_t on everything but BE LP64 targets to uint64_t.
Note that the default behaviour for clang remains unchanged and the new layout
must be explicitly opted into via `-fcf-runtime-abi=swift*`.
llvm-svn: 345222
'ignore-non-existent-contents' stopped working after r342232 in a way
that the actual attribute value isn't used and it works as if it is
always `true`.
Common use case for VFS iteration is iterating through files in umbrella
directories for modules. Ability to detect if some VFS entries point to
non-existing files is nice but non-critical. Instead of adding back
support for `'ignore-non-existent-contents': false` I am removing the
attribute, because such scenario isn't used widely enough and stricter
checks don't provide enough value to justify the maintenance.
rdar://problem/45176119
Reviewers: bruno
Reviewed By: bruno
Subscribers: hiraditya, dexonsmith, sammccall, cfe-commits
Differential Revision: https://reviews.llvm.org/D53228
llvm-svn: 345212