In certain cases, the compiler might try to merge __stack_chk_guard with
another global variable. (Or someone could theoretically define
__stack_chk_guard as an alias.) In that case, make sure we don't crash.
Differential Revision: https://reviews.llvm.org/D45746
llvm-svn: 330495
Summary:
ubsan found that we sometimes pass nullptr to memcpy in
SectionChunk::writeTo(). This change adds a check that avoids that.
Reviewers: ruiu
Reviewed By: ruiu
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D45789
llvm-svn: 330490
When creating a call to storeStrong in ObjCARCContract, ensure the call
gets the correct funclet token, otherwise WinEHPrepare will turn the
call (and all subsequent instructions) into unreachable.
We already have logic to do this for the ARC autorelease elision marker;
factor that out into a common function that's used for both. These are
the only two places in this transform that create call instructions.
Differential Revision: https://reviews.llvm.org/D45857
llvm-svn: 330487
Split the fp and integer vector logical instruction scheduler classes - older CPUs especially often handled these on different pipes.
This unearthed a couple of things that are also handled in this patch:
(1) We were tagging avx512 fp logic ops as WriteFAdd, probably because of the lack of WriteFLogic
(2) SandyBridge had integer logic ops only using Port5, when afaict they can use Ports015.
(3) Cleaned up x86 FCHS/FABS scheduling as they are typically treated as fp logic ops.
Differential Revision: https://reviews.llvm.org/D45629
llvm-svn: 330480
This is what link.exe does and lets us avoid needing to worry about
merging output characteristics while adding input sections to output
sections.
With this change we can't process /merge in the same way as before
because sections with different output characteristics can still
be merged into one another. So this change moves the processing of
/merge to just before we assign addresses. In the case where there
are multiple output sections with the same name, link.exe only merges
the first section with the source name into the first section with
the target name, and we do the same.
At the same time I also implemented transitive merging (which means
that /merge:.c=.b /merge:.b=.a merges both .c and .b into .a).
This isn't quite enough though because link.exe has a special case for
.CRT in 32-bit mode: it processes sections whose output characteristics
are DATA | R | W as though the output characteristics were DATA | R
(so that they get merged into things like constructor lists in the
expected way). Chromium has a few such sections, and it turns out
that those sections were causing the problem that resulted in r318699
(merge .xdata into .rdata) being reverted: because of the previous
permission merging semantics, the .CRT sections were causing the entire
.rdata section to become writable, which caused the SEH runtime to
crash because it apparently requires .xdata to be read-only. This
change also implements the same special case.
This should unblock being able to merge .xdata into .rdata by default,
as well as .bss into .data, both of which will be done in followups.
Differential Revision: https://reviews.llvm.org/D45801
llvm-svn: 330479
This diff fixes sh_link for various types of sections
(i.e. for SHT_ARM_EXIDX, SHT_HASH). In particular, this change enables us
to use llvm-objcopy with clang -gsplit-dwarf for the target android-arm.
Test plan: make check-all
Differential revision: https://reviews.llvm.org/D45851
llvm-svn: 330478
Summary: The LIBOMPTARGET_NVPTX_DEBUG flag is inconsistent between using nvcc to generate .a file and clang to generate .bc file. Sync the two setting so we can get debug messages from the bc file path as well.
Reviewers: grokos
Subscribers: Hahnfeld, openmp-commits, mgorny
Tags: #openmp
Differential Revision: https://reviews.llvm.org/D45530
llvm-svn: 330477
Summary:
Support the dynamic shadow memory offset (the default case for user
space now) and static non-zero shadow memory offset
(-hwasan-mapping-offset option). Keeping the the latter case around
for functionality and performance comparison tests (and mostly for
-hwasan-mapping-offset=0 case).
The implementation is stripped down ASan one, picking only the relevant
parts in the following assumptions: shadow scale is fixed, the shadow
memory is dynamic, it is accessed via ifunc global, shadow memory address
rematerialization is suppressed.
Keep zero-based shadow memory for kernel (-hwasan-kernel option) and
calls instreumented case (-hwasan-instrument-with-calls option), which
essentially means that the generated code is not changed in these cases.
Reviewers: eugenis
Subscribers: srhines, llvm-commits
Differential Revision: https://reviews.llvm.org/D45840
llvm-svn: 330475
Summary:
Retire the fixed shadow memory mapping to avoid conflicts with default
process memory mapping (currently manifests on Android).
Tests on AArch64 show <1% performance loss and code size increase,
making it possible to use dynamic shadow memory by default.
For the simplicity and unifirmity sake, use dynamic shadow memory mapping
with base address accessed via ifunc resolver on all supported platforms.
Keep the fixed shadow memory mapping around to be able to run
performance comparison tests later.
Complementing D45840.
Reviewers: eugenis
Subscribers: srhines, kubamracek, dberris, mgorny, kristof.beyls, delcypher, #sanitizers, llvm-commits
Differential Revision: https://reviews.llvm.org/D45847
llvm-svn: 330474
The callback used to create an ORE for the legacy PI pass caches the allocated
object in a unique_ptr in the runOnModule function, and returns a reference to
that object. Under certian circumstances we can end up holding onto that
reference after the OREs destruction. Rather then allowing the new and legacy
passes to create ORE object in diffrent ways, create the ORE at the point of
use.
Differential Revision: https://reviews.llvm.org/D43219
llvm-svn: 330473
There was some unfortunate interaction between VSPLAT and BITCAST
related to the selection of constant vectors (coming from selecting
shuffles). Introduce VSPLATW that always splats a 32-bit word, and
can have arbitrary result type (to avoid BITCASTs of VSPLAT).
Clean up the previous selection of BITCAST/VSPLAT.
llvm-svn: 330471
Although sprintf is not intercepted on Windows, this test can pass
if sprintf calls memmove, which is intercepted, so we can't XFAIL it.
Differential Revision: https://reviews.llvm.org/D45894
llvm-svn: 330469
Before this patch, ISL_ASSERT only printed an error message to stderr.
This can be easily missed if the program continues or just fails later.
To fail-early and help error diagnostics (e.g. using bugpoint), call
abort() when an assertion does not hold.
I seem to just have forgotten to add this abort() when I originally
proposed the ISL_ASSERT macro.
Suggested-By: Eli Friedman <efriedma@codeaurora.org>
Differential Revision: https://reviews.llvm.org/D45171
llvm-svn: 330467
Add the switch -polly-debug-func to define the name of a debug
function. This function is ignored for any validity check.
Its purpose is to allow to observe a value after transformation by a
SCoP, and to follow which statements are executed in which order. For
instance, consider the following code:
static void dbg_printf(int sum, int i) {
fprintf(stderr, "The value of sum is %d, i=%d\n", sum, i);
fflush(stderr);
}
void func(int n) {
int sum = 0;
for (int i = 0; i < 16; i+=1) {
sum += i;
dbg_printf(sum, i);
}
}
Executing this after Polly's codegen with -polly-debug-func=dbg_printf
reveals the new execution order and the assumed values at that point of
execution.
Differential Revision: https://reviews.llvm.org/D45728
llvm-svn: 330466
Three new instructions:
umonitor - Sets up a linear address range to be
monitored by hardware and activates the monitor.
The address range should be a writeback memory
caching type.
umwait - A hint that allows the processor to
stop instruction execution and enter an
implementation-dependent optimized state
until occurrence of a class of events.
tpause - Directs the processor to enter an
implementation-dependent optimized state
until the TSC reaches the value in EDX:EAX.
Also modifying the description of the mfence
instruction, as the rep prefix (0xF3) was allowed
before, which would conflict with umonitor during
disassembly.
Before:
$ echo 0xf3,0x0f,0xae,0xf0 | llvm-mc -disassemble
.text
mfence
After:
$ echo 0xf3,0x0f,0xae,0xf0 | llvm-mc -disassemble
.text
umonitor %rax
Reviewers: craig.topper, zvi
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D45253
llvm-svn: 330462
It's possible to have an empty object file, for example if you
just compile an empty .c file. This file won't have any sections
so asserting that a file has chunks is definitely wrong.
llvm-svn: 330461
First off, this is more correct than having the B. Second off, this was making
a bot upset. This fixes that.
Update the test to include -verify-machineinstrs as well to prevent stuff like
this slipping by non debug/assert builds in the future.
llvm-svn: 330459
Summary:
Example:
Printf("%.*s", 5, "123");
should yield:
'123 '
In case Printf's requested string precision is larger than the string
argument, the resulting string should be padded up to the requested
precision.
For the simplicity sake, implementing right padding only.
Reviewers: eugenis
Subscribers: kubamracek, delcypher, #sanitizers, llvm-commits
Differential Revision: https://reviews.llvm.org/D45844
llvm-svn: 330458
Part of the DBI stream is a list of variable length structures
describing each module that contributes to the final executable.
One member of this structure is a section contribution entry that
describes the first section contribution in the output file for
the given module.
We have been leaving this structure unpopulated until now, so with
this patch it is now filled out correctly.
Differential Revision: https://reviews.llvm.org/D45832
llvm-svn: 330457
It was added 6.5 years ago in r144345, but was never hooked up and has been
unused since. If _you_ do use this, feel free to revert, but add a comment
on where it's used.
https://reviews.llvm.org/D45262
llvm-svn: 330455
If we don't mark the cfi line as optional, the script won't
work with 'nounwind' code. Without that attr, there may be
extra noise in the asm body that we don't want to see.
llvm-svn: 330453
Right now we only use this information in one place, immediately after
we calculate it, but it's still nice information to have. The Swift
project is going to use this to tidy up its "API notes" feature (see
past discussion on cfe-dev that never quite converged).
Reviewed by Bruno Cardoso Lopes.
llvm-svn: 330452
The isOverload() method needs to account for situations where the two
methods being compared don't have the same number of arguments.
rdar://problem/39542960
llvm-svn: 330450
Enables cleaning up confusion between which name variables are mangled
and which are unmangled, and --print-gc-sections then excersises and
tests that.
Differential Revision: https://reviews.llvm.org/D44440
llvm-svn: 330449
Some targets need special LLVM calling convention for CUDA kernel.
This patch does that through a TargetCodeGenInfo hook.
It only affects amdgcn target.
Patch by Greg Rodgers.
Revised and lit tests added by Yaxun Liu.
Differential Revision: https://reviews.llvm.org/D45223
llvm-svn: 330447