These maps are only needed during the construction of a single region statement.
Clearing them is important, as we otherwise get an assert in case some of the
referenced values are erased before the RegionGenerator is deleted.
llvm-svn: 251341
On UNIX (but not Darwin) the username needs to be respected when creating a
temporary module directory, so that different users don't pollute each others'
module caches.
llvm-svn: 251340
This is a patch to improve StringTableBuilder's performance. That class'
finalize function is very hot particularly in LLD because the function
does tail-merge strings in string tables or SHF_MERGE sections.
Generic std::sort-style sorter is not efficient for sorting strings.
The function implemented in this patch seems to be more efficient.
Here's a benchmark of LLD to link Clang with or without this patch.
The numbers are medians of 50 runs.
-O0
real 0m0.455s
real 0m0.430s (5.5% faster)
-O3
real 0m0.487s
real 0m0.452s (7.2% faster)
Since that is a benchmark of the whole linker, the speedup of
StringTableBuilder itself is much more than that.
http://reviews.llvm.org/D14053
llvm-svn: 251337
Summary:
In `MismatchingNewDeleteDetector::analyzeInClassInitializer`, if
`Field`'s initializer expression is null, lookup the field in
implicit instantiation, and use found field's the initializer.
Reviewers: rsmith, rtrieu
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D9898
llvm-svn: 251335
We should remove noalias along with dereference and dereference_or_null attributes
because statepoint could potentially touch the entire heap including noalias objects.
Differential Revision: http://reviews.llvm.org/D14032
llvm-svn: 251333
This patch fixes the ptrace interceptor for aarch64. The PTRACE_GETREGSET
ptrace syscall with with invalid memory might zero the iovec::iov_base
field and then masking the subsequent check after the syscall (since it
will be 0 and it will not trigger an invalid access). The fix is to copy
the value on a local variable and use its value on the checks.
The patch also adds more coverage on the Linux/ptrace.cc testcase by addding
check for PTRACE_GETREGSET for both general and floating registers (aarch64
definitions added only).
llvm-svn: 251331
This adds a couple of optimization remarks to the SamplePGO
transformation. When it decides to inline a hot function (to mimic the
inline stack and repeat useful inline decisions in the original build).
It will also report branch destinations. For instance, given the code
fragment:
6 if (i < 1000)
7 sum -= i;
8 else
9 sum += -i * rand();
If the 'else' branch is taken most of the time, building this code with
-Rpass=sample-profile will produce:
a.cc:9:14: remark: most popular destination for conditional branches at small.cc:6:9 [-Rpass=sample-profile]
sum += -i * rand();
^
llvm-svn: 251330
Python 3 has a different syntax for octal literals than Python 2
and they are incompatible with each other. Six doesn't provide
a transparent wrapper around this, so the most sane thing to do
is to not use octal literals. If you need an octal literal,
use a decimal literal and if it's not obvious what the value is,
provide the value in octal as a comment.
llvm-svn: 251328
Processing bitcode from a different LLVM version can lead to
unexpected behavior. The LLVM project guarantees autoupdating
bitcode from a previous minor revision for the same major, but
can't make any promise when reading bitcode generated from a
either a non-released LLVM, a vendor toolchain, or a "future"
LLVM release. This patch aims at being more user-friendly and
allows a bitcode produce to emit an optional block at the
beginning of the bitcode that will contains an opaque string
intended to describe the bitcode producer information. The
bitcode reader will dump this information alongside any error it
reports.
The optional block also includes an "epoch" number, monotonically
increasing when incompatible changes are made to the bitcode. The
reader will reject bitcode whose epoch is different from the one
expected.
Differential Revision: http://reviews.llvm.org/D13666
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 251325
Android libc provides a fixed TLS slot for the unsafe stack pointer,
and this change implements direct access to that slot on AArch64 via
__builtin_thread_pointer() + offset.
This change also moves more code into TargetLowering and its
target-specific subclasses to get rid of target-specific codegen
in SafeStackPass.
This change does not touch the ARM backend because ARM lowers
builting_thread_pointer as aeabi_read_tp, which is not available
on Android.
The previous iteration of this change was reverted in r250461. This
version leaves the generic, compiler-rt based implementation in
SafeStack.cpp instead of moving it to TargetLoweringBase in order to
allow testing without a TargetMachine.
llvm-svn: 251324
We were previously overflowing a 32-bit multiply operation when emitting large
(>512MB) bitcode files, resulting in corrupted bitcode. Fix by extending
one of the operands to 64 bits.
There are a few other 32-bit integer types in this code that seem like they
also ought to be extended to 64 bits; this will be done separately.
llvm-svn: 251323
In PIC mode we were previously computing global variable addresses (or GOT
entry addresses) by adding the PC, the PC-relative GOT displacement and
the GOT-relative symbol/GOT entry displacement. Because the latter two
displacements are fixed, we ended up performing one more addition than
necessary.
This change causes us to compute addresses using a single PC-relative
displacement, resulting in a shorter code sequence. This reduces code size
by about 4% in a recent build of Chromium for Android.
As a result of this change we no longer need to compute the GOT base address
in the ARM backend, which allows us to remove the Global Base Reg pass and
SDAG lowering for the GOT.
We also now no longer use the GOT when addressing a symbol which is known
to be defined in the same linkage unit. Specifically, the symbol must have
either hidden visibility or a strong definition in the current module in
order to not use the the GOT.
This is a change from the previous behaviour where we would use the GOT to
address externally visible symbols defined in the same module. I think the
only cases where this could matter are cases involving symbol interposition,
but we don't really support that well anyway.
Differential Revision: http://reviews.llvm.org/D13650
llvm-svn: 251322
This patch enables the ptrace syscall interceptors for arm and adds support
for both PTRACE_GETVFPREGS and PTRACE_SETVFPREGS used to get the VFP register
from ARM.
The ptrace tests is also updated with arm and PTRACE_GETVFPREGS tests.
llvm-svn: 251321
This issue is triggered in PGO mode when bootstrapping LLVM. It seems that it is not guaranteed that edge weights are always greater than zero which are read from profile data.
llvm-svn: 251317
The problem was that the @skipIfNoSBHeaders on darwin was trying to use self.lib_dir when it hadn't been set yet.
I looked at the code and places were required to set "self.lib_dir" for no real reason as all places that used it just used the LLDB_LIB_DIR environment variable. So I removed all uses of self.lib_dir and replaced them to use 'os.environ["LLDB_LIB_DIR"]'. Did the same for self.implib_dir.
llvm-svn: 251315
The regex for -isystem matching is broken. -[D,I,Usystem] matches "-D", "-,",
"-I", "-U", "-s" "-y", etc. Besides that, "-isystem /foo" gets interpreted as
"-i" with a non-empty value "system" and thus the next "/foo" argument is not
read. This patch corrects the regex.
This fixes PR13237 <https://llvm.org/bugs/show_bug.cgi?id=13237>.
A patch by Peter Wu!
Differential Revision: http://reviews.llvm.org/D13800
llvm-svn: 251312
Also since we always read in the DWARF data or mmap it, we don't need to make a copy of the strings for the directories and file names, we can just store "cosnt char *" values. Every place that uses the prologues use them temporarily and then throw them away so no one is expecting the directory and filename strings to live longer than the parse functions.
llvm-svn: 251310
Python3 has no analogue to sys.maxint since ints in Python 3 have
arbitrary size. However, the distinction was not actually important
in any of these cases, and in a few cases using maxint was already
a bug to begin with.
llvm-svn: 251306
This avoids the need to query the PC for private resume operations (public resumes have the PC
from the bigger jStopInfo packet) and speeds up the stepping on an android target by about 10%
(it some cases even more).
llvm-svn: 251301
There was a threading issue in the ICF code for COFF. That seems like
a venign bug in the sense that it doesn't produce an incorrect output,
but it oftentimes misses reducible sections. As a result, mergeable
sections could remain in outputs, which makes the output nondeterministic.
Basically the algorithm we are using for ICF is this: We group sections
so that identical sections will eventually be in the same group. Initially,
all sections are in one group. We split the group by relocation targets
until we get a convergence (if relocation targets are in different gruops,
the sections are different). Once a group is split, they will never be
merged.
Each section has a group ID. That variable itself is atomic, so there's
no threading issue at the level that we can use thread sanitizer.
The point is, when we split a group, we re-assign new group IDs to group
of sections. That are multiple separate writes to atomic varaibles.
Thus, splitting a group is not an atomic operation, and there's a small
chance that the other thread observes inconsistent group IDs.
Over-splitting is always "safe", so it will never create incorrect output.
I suspect that the nondeterminism stems from that point. However, I
cannot prove or fix that at this moment, so I'm going to avoid using
threads here.
llvm-svn: 251300
Instead of XFAIL-ing the tests with the wrong usage of the "interrupt"
attribute, we should check that we emit the correct error messages to
the user.
llvm-svn: 251295
Even though we may not know the value of the shifter operand, it's possible we know the shifter operand is non-zero. This can allow us to infer more known bits - for example:
%1 = load %p !range {1, 5}
%2 = shl %q, %1
We don't know %1, but we do know that it is nonzero so %2[0] is known zero, and importantly %2 is known non-zero.
Calling isKnownNonZero is nontrivially expensive so use an Optional to run it lazily and cache its result.
llvm-svn: 251294