Summary:
Removing the dropped symbols will prevent indirect call promotion in the
ThinLTO Backend from adding a new reference to a symbol, which can
result in linker unsats. This can happen when we compile with a sample
profile collected from one binary by used for another, which may have
profiled targets that aren't used in the new binary.
Note that until dropDeadSymbols handles variables and aliases (in
progress), we may not be able to remove the declaration and can still
have an issue.
Reviewers: grimar, davidxl
Subscribers: mehdi_amini, inglorion, llvm-commits, eraman
Differential Revision: https://reviews.llvm.org/D42816
llvm-svn: 324299
Summary:
This patch (on top of https://reviews.llvm.org/D35755) provides the clang side necessary
to enable the Solaris port of the sanitizers implemented by https://reviews.llvm.org/D40898,
https://reviews.llvm.org/D40899, and https://reviews.llvm.org/D40900).
A few features of note:
* While compiler-rt cmake/base-config-ix.cmake (COMPILER_RT_OS_DIR) places
the runtime libs in a tolower(CMAKE_SYSTEM_NAME) directory, clang defaults to
the OS part of the target triplet (solaris2.11 in the case at hand). The patch makes
them agree on compiler-rt's idea.
* While Solaris ld accepts a considerable number of GNU ld options for compatibility,
it only does so for the double-dash forms. clang unfortunately is inconsistent here
and sometimes uses the double-dash form, sometimes the single-dash one that
confuses the hell out of Solaris ld. I've changed the affected places to use the double-dash
form that should always work.
* As described in https://reviews.llvm.org/D40899, Solaris ld doesn't create the
__start___sancov_guards/__stop___sancov_guards labels gld/gold/lld do, so I'm
including additional runtime libs into the link that provide them.
* One test uses -fstack-protector, but unlike other systems libssp hasn't been folded
into Solaris libc, but needs to be linked with separately.
* For now, only 32-bit x86 asan is enabled on Solaris. 64-bit x86 should follow, but
sparc (which requires additional compiler-rt changes not yet submitted) fails miserably
due to a llvmsparc backend limitation:
fatal error: error in backend: Function "_ZN7testing8internal16BoolFromGTestEnvEPKcb": over-aligned dynamic alloca not supported.
However, inside the gcc tree, Solaris/sparc asan works almost as well as x86.
Reviewers: rsmith, alekseyshl
Reviewed By: alekseyshl
Subscribers: jyknight, fedor.sergeev, cfe-commits
Tags: #sanitizers
Differential Revision: https://reviews.llvm.org/D40903
llvm-svn: 324296
We now allow all signed comparisons and not equal. The complement that needs to be added for this is no worse than the extend. And the vector output forms of pcmpeq/pcmpgt have better latency than the k-register version on SKX.
llvm-svn: 324294
In the motivating case from PR35681 and represented by the macro-fuse-cmp test:
https://bugs.llvm.org/show_bug.cgi?id=35681
...there's a 37 -> 31 byte size win for the loop because we eliminate the big base
address offsets.
SPEC2017 on Ryzen shows no significant perf difference.
Differential Revision: https://reviews.llvm.org/D42607
llvm-svn: 324289
This change reduces the live range of the loaded function pointer,
resulting in a slight code size decrease (~10KB in clang), and also
improves the security of CFI for virtual calls by making it less
likely that the function pointer will be spilled, and ensuring that
it is not spilled across a function call boundary.
Fixes PR35353.
Differential Revision: https://reviews.llvm.org/D42725
llvm-svn: 324286
Summary:
Before Xcode 4.5, undefined weak symbols don't work reliably on Darwin:
https://stackoverflow.com/questions/6009321/weak-symbol-link-on-mac-os-x
Therefore this patch disables their use before Mac OS X 10.9 which is the first version
only supported by Xcode 4.5 and above.
Reviewers: glider, kcc, vitalybuka
Reviewed By: vitalybuka
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D41346
llvm-svn: 324284
Summary:
This change is part of step five in the series of changes to remove alignment argument from
memcpy/memmove/memset in favour of alignment attributes. In particular, this changes the
LowerMemIntrinsics pass to cease using the old getAlignment() API of MemoryIntrinsic in
favour of getting source & dest specific alignments through the new API.
Steps:
Step 1) Remove alignment parameter and create alignment parameter attributes for
memcpy/memmove/memset. ( rL322965, rC322964, rL322963 )
Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing
source and dest alignments. ( rL323597 )
Step 3) Update Clang to use the new IRBuilder API. ( rC323617 )
Step 4) Update Polly to use the new IRBuilder API. ( rL323618 )
Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API,
and those that use use MemIntrinsicInst::[get|set]Alignment() to use [get|set]DestAlignment()
and [get|set]SourceAlignment() instead. ( rL323886, rL323891, rL324148, rL324273 )
Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the
MemIntrinsicInst::[get|set]Alignment() methods.
Reference
http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.htmlhttp://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html
llvm-svn: 324278
Summary:
This change is part of step five in the series of changes to remove alignment argument from
memcpy/memmove/memset in favour of alignment attributes. In particular, this changes the
SimplifyLibCalls pass to cease using the old IRBuilder createMemCpy/createMemMove
single-alignment APIs in favour of the new API that allows setting source and destination
alignments independently.
Steps:
Step 1) Remove alignment parameter and create alignment parameter attributes for
memcpy/memmove/memset. ( rL322965, rC322964, rL322963 )
Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing
source and dest alignments. ( rL323597 )
Step 3) Update Clang to use the new IRBuilder API. ( rC323617 )
Step 4) Update Polly to use the new IRBuilder API. ( rL323618 )
Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API,
and those that use use MemIntrinsicInst::[get|set]Alignment() to use [get|set]DestAlignment()
and [get|set]SourceAlignment() instead. ( rL323886, rL323891, r3L24148 )
Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the
MemIntrinsicInst::[get|set]Alignment() methods.
Reference
http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.htmlhttp://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html
llvm-svn: 324273
The major visible difference here is that in line-table dumps,
directory and file names are wrapped in double-quotes; previously,
directory names got single quotes and file names were not quoted at
all.
The improvement in this patch is that when a DWARF v5 line table
header has indirect strings, in a verbose dump these will all have
their section[offset] printed as well as the name itself. This
matches the format used for dumping strings in the .debug_info
section.
Differential Revision: https://reviews.llvm.org/D42802
llvm-svn: 324270
The 'trivial_abi' attribute can be applied to a C++ class, struct, or
union. It makes special functions of the annotated class (the destructor
and copy/move constructors) to be trivial for the purpose of calls and,
as a result, enables the annotated class or containing classes to be
passed or returned using the C ABI for the underlying type.
When a type that is considered trivial for the purpose of calls despite
having a non-trivial destructor (which happens only when the class type
or one of its subobjects is a 'trivial_abi' class) is passed to a
function, the callee is responsible for destroying the object.
For more background, see the discussions that took place on the mailing
list:
http://lists.llvm.org/pipermail/cfe-dev/2017-November/055955.htmlhttp://lists.llvm.org/pipermail/cfe-commits/Week-of-Mon-20180101/thread.html#214043
rdar://problem/35204524
Differential Revision: https://reviews.llvm.org/D41039
llvm-svn: 324269
Summary:
The 32-bit division breaks SizeClassAllocator64PopulateFreeListOOM which uses
Primary that has a maximum size > 32-bit.
Reviewers: alekseyshl
Subscribers: kubamracek, delcypher, #sanitizers, llvm-commits
Differential Revision: https://reviews.llvm.org/D42928
llvm-svn: 324268
As PR36225 shows, we definitely don't want to enable the
canEvaluate* logic with phis.
There's still a question of whether we should just revert
r324014 completely because it exposes a compile-time sinkhole
(although that problem might exist independently).
llvm-svn: 324266
Summary:
The signatures for the builtins @llvm.memcpy, @llvm.memmove, and @llvm.memset
where changed in rL322965. The number of arguments has decreased from five to
four with the removal of the alignment argument. Alignment is now conveyed
by supplying the align parameter attribute on the destination and/or source of
the cpy/move/set.
llvm-svn: 324265
When using Elf_Rela every tool should use the addend in the
relocation.
We have --apply-dynamic-relocs to work around bugs in tools that don't
do that.
The default value of --apply-dynamic-relocs should be false to make
sure these bugs are more easily found in the future.
llvm-svn: 324264
Summary:
In `ClassID`, make sure we use an unsigned as based for the `lbits` shift.
The previous code resulted in spurious sign extensions like for x64:
```
add esi, 0FFFFFFFFh
movsxd rcx, esi
and rcx, r15
```
The code with the `U` added is:
```
add esi, 0FFFFFFFFh
and rsi, r15
```
And for `MaxCachedHint`, use a 32-bit division instead of `64-bit`, which is
faster (https://lemire.me/blog/2017/11/16/fast-exact-integer-divisions-using-floating-point-operations/)
and already used in other parts of the code (64-bit `GetChunkIdx`, 32-bit
`GetMetaData` enforce 32-bit divisions)
Not major performance gains by any mean, but they don't hurt.
Reviewers: alekseyshl
Reviewed By: alekseyshl
Subscribers: kubamracek, delcypher, llvm-commits, #sanitizers
Differential Revision: https://reviews.llvm.org/D42916
llvm-svn: 324263
This allows the immediate to folded into the and instead of being forced to move into a register. This can sometimes result in shorter encodings since the and can sign extend an immediate.
This also allows us to match an and to a movzx after a not.
This can cause an extra move if the input to the separate NOT has an additional user which requires a copy before the NOT.
llvm-svn: 324260
Summary:
Here are a few improvements proposed for the local cache:
- `InitCache` always read from `per_class_[1]` in the fast path. This was not
ideal as we are working with `per_class_[class_id]`. The latter offers the
same property we are looking for (eg: `max_count != 0` means initialized),
so we might as well use it and keep our memory accesses local to the same
`per_class_` element. So change `InitCache` to take the current `PerClass`
as an argument. This also makes the fast-path assembly of `Deallocate` a lot
more compact;
- Change the 32-bit `Refill` & `Drain` functions to mimic their 64-bit
counterparts, by passing the current `PerClass` as an argument. This saves
some array computations;
- As far as I can tell, `InitCache` has no place in `Drain`: it's either called
from `Deallocate` which calls `InitCache`, or from the "upper" `Drain` which
checks for `c->count` to be greater than 0 (strictly). So remove it there.
- Move the `stats_` updates to after we are done with the `per_class_` accesses
in an attempt to preserve locality once more;
- Change some `CHECK` to `DCHECK`: I don't think the ones changed belonged in
the fast path and seemed to be overly cautious failsafes;
- Mark some variables as `const`.
The overall result is cleaner more compact fast path generated code, and some
performance gains with Scudo (and likely other Sanitizers).
Reviewers: alekseyshl
Reviewed By: alekseyshl
Subscribers: kubamracek, delcypher, #sanitizers, llvm-commits
Differential Revision: https://reviews.llvm.org/D42851
llvm-svn: 324257
Davide pointed out this would be useful if the file ever needs to be
regenerated (and I certainly agree).
I also replace the test binary with a slightly smaller one -- I intended
to do this in the original commit, but I forgot to add it to the patch
as I was juggling several things at the same time.
llvm-svn: 324256
This is the instcombine part of unsigned saturation canonicalization.
Backend patches already commited:
https://reviews.llvm.org/D37510https://reviews.llvm.org/D37534
It converts unsigned saturated subtraction patterns to forms recognized
by the backend:
(a > b) ? a - b : 0 -> ((a > b) ? a : b) - b)
(b < a) ? a - b : 0 -> ((a > b) ? a : b) - b)
(b > a) ? 0 : a - b -> ((a > b) ? a : b) - b)
(a < b) ? 0 : a - b -> ((a > b) ? a : b) - b)
((a > b) ? b - a : 0) -> - ((a > b) ? a : b) - b)
((b < a) ? b - a : 0) -> - ((a > b) ? a : b) - b)
((b > a) ? 0 : b - a) -> - ((a > b) ? a : b) - b)
((a < b) ? 0 : b - a) -> - ((a > b) ? a : b) - b)
Patch by Yulia Koval!
Differential Revision: https://reviews.llvm.org/D41480
llvm-svn: 324255
ObjectFileELF::GetModuleSpecifications contained a lot of tip-toing code
which was trying to avoid loading the full object file into memory. It
did this by trying to load data only up to the offset if was accessing.
However, in practice this was useless, as 99% of object files we
encounter have section headers at the end, so we would load the whole
file as soon as we start parsing the section headers.
In fact, this would break as soon as we encounter a file which does
*not* have section headers at the end (yaml2obj produces these), as the
access to .strtab (which we need to get the section names) was not
guarded by this offset check.
As this strategy was completely ineffective anyway, I do not attempt to
proliferate it further by guarding the .strtab accesses. Instead I just
lead the full file as soon as we are reasonably sure that we are indeed
processing an elf file.
If we really care about the load size here, we would need to reimplement
this to just load the bits of the object file we need, instead of
loading everything from the start of the object file to the given
offset. However, given that the OS will do this for us for free when
using mmap, I think think this is really necessary.
For testing this I check a (tiny) SO file instead of yaml2obj-ing it
because the fact that they come out first is an implementation detail of
yaml2obj that can change in the future.
llvm-svn: 324254
There was a logic hole in D42739 / rL324014 because we're not accounting for select and phi
instructions that might have repeated operands. This is likely a source of an infinite loop.
I haven't manufactured a test case to prove that, but it should be safe to speculatively limit
this transform to binops while we try to create that test.
llvm-svn: 324252
If the upper 32 bits of a 64 bit mask are all zeros, we have special isel patterns to use a 32-bit and instead of a 64-bit and by relying on the impliciting zeroing of 32 bit ops.
This patch teachs shrinkAndImmediate not to break that optimization.
Differential Revision: https://reviews.llvm.org/D42899
llvm-svn: 324249
This broke the Chromium build; see PR36238.
> This patch is an enhancement to propagate dbg.value information when
> Phis are created on behalf of LCSSA. I noticed a case where a value
> carried across a loop was reported as <optimized out>.
>
> Specifically this case:
>
> int bar(int x, int y) {
> return x + y;
> }
>
> int foo(int size) {
> int val = 0;
> for (int i = 0; i < size; ++i) {
> val = bar(val, i); // Both val and i are correct
> }
> return val; // <optimized out>
> }
>
> In the above case, after all of the interesting computation completes
> our value is reported as "optimized out." This change will add a
> dbg.value to correct this.
>
> This patch also moves the dbg.value insertion routine from
> LoopRotation.cpp into Local.cpp, so that we can share it in both places
> (LoopRotation and LCSSA).
>
> Patch by Matt Davis!
>
> Differential Revision: https://reviews.llvm.org/D42551
llvm-svn: 324247
Summary:
When a preprocessor indent closes after the last line of normal code we do not
correctly fixup include guard indents. For example:
#ifndef HEADER_H
#define HEADER_H
#if 1
int i;
# define A 0
#endif
#endif
incorrectly reformats to:
#ifndef HEADER_H
#define HEADER_H
#if 1
int i;
# define A 0
# endif
#endif
To resolve this issue we must fixup levels after parseFile(). Delaying
the fixup introduces a new state, so consolidate include guard search
state into an enum.
Reviewers: krasimir, klimek
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D42035
llvm-svn: 324246