This enables peeling of loops with low dynamic iteration count by default,
when profile information is available.
Differential Revision: https://reviews.llvm.org/D27734
llvm-svn: 295796
declaration declared using class template argument deduction.
Patch by Eric Fiselier (who is busy and asked me to commit this on his behalf)!
Differential Revision: https://reviews.llvm.org/D30082
llvm-svn: 295794
I added this log message to test the /msvclto option, but
this output might confuse FileCheck. This patch attempts to fix
it by removing it.
llvm-svn: 295793
We need to look through the PackExpansionType in the parameter type when
deducing, and we need to consider the possibility of deducing arguments for
packs that are not lexically mentioned in the pattern (but are nonetheless
deducible) when figuring out which packs are covered by a pack deduction scope.
llvm-svn: 295790
Change implementation to use max instead of add.
min/max/med3 do not flush denormals regardless of the mode,
so it is OK to use it whether or not they are enabled.
Also allow using clamp with f16, and use knowledge
of dx10_clamp.
llvm-svn: 295788
LLD is a multi-threaded program. errs() or outs() are not guaranteed
to be thread-safe (they are actually not).
LLD's message(), log() or error() are thread-safe. We should use them.
llvm-svn: 295787
Original code only used vector loads/stores for explicit vector arguments.
It could also do more loads/stores than necessary (e.g v5f32 would
touch 8 f32 values). Aggregate types were loaded one element at a time,
even the vectors contained within.
This change attempts to generalize (and simplify) parameter space
loads/stores so that vector loads/stores can be used more broadly.
Functionality of the patch has been verified by compiling thrust
test suite and manually checking the differences between PTX
generated by llvm with and without the patch.
General algorithm:
* ComputePTXValueVTs() flattens input/output argument into a flat list
of scalars to load/store and returns their types and offsets.
* VectorizePTXValueVTs() uses that data to create vectorization plan
which returns an array of flags marking boundaries of vectorized
load/stores. Scalars are represented as 1-element vectors.
* Code that generates loads/stores implements a simple state machine
that constructs a vector according to the plan.
Differential Revision: https://reviews.llvm.org/D30011
llvm-svn: 295784
Summary:
POSIX requires lgamma writes to an external global variable, signgam.
This prevents annotating lgamma with readnone, which is incorrect on
targets that write to signgam.
Reviewers: efriedma, rsmith
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D29778
llvm-svn: 295781
This change exposes the symbol table insert method and uses it to
insert the linkerscript defined symbols directly into the symbol
table to avoid unnecessarily pulling the object out of an archive.
Differential Revision: https://reviews.llvm.org/D30224
llvm-svn: 295780
Summary: This is a patch for PR31836. As the bug replaces the path separators in the included file name with the characters following them, the test script makes sure that there's no "Ccase-insensitive-include-pr31836.h" in the warning message.
Reviewers: rsmith, eric_niebler
Reviewed By: eric_niebler
Subscribers: karies, cfe-commits
Differential Revision: https://reviews.llvm.org/D30000
llvm-svn: 295779
Summary: I'm not sure why they were in different files, but it's kind of harder to maintain. I create this patch partially for initiate a discussion.
Reviewers: dberris
Subscribers: nemanjai, cfe-commits
Differential Revision: https://reviews.llvm.org/D30118
llvm-svn: 295778
Summary:
If symbolizer was instrumented with sanitizer and crash, it may
try to call itself again causing infinite recursion of crashing processes.
Reviewers: eugenis
Subscribers: kubamracek, llvm-commits, dberris
Differential Revision: https://reviews.llvm.org/D30222
llvm-svn: 295771
Since I'm only seeing failures on OSX, and it's saying
permission denied, I'm suspecting this is due to the addition
of the MAP_RESILIENT_CODESIGN and/or MAP_RESILIENT_MEDIA flags.
Speculatively trying to remove those to get the bots working.
llvm-svn: 295770
This test has been reverted in r279918 due to flaky atos support in the OS
some machines in the buildbot fleet were running. This should not be a
problem anymore.
llvm-svn: 295766
For whatever reason ld64 requires that member headers (not the member
themselves) should be aligned. The only way to do that is to edit the
previous member so that it ends at an aligned boundary.
Since modifying data put in an archive is an undesirable property,
llvm-ar should only do it when it is absolutely necessary.
llvm-svn: 295765
Summary: AddDiscriminator pass is only useful for sample pgo. This patch restricts AddDiscriminator to -fdebug-info-for-profiling so that it does not introduce unecessary debug size increases for non-sample-pgo builds.
Reviewers: dblaikie, aprantl
Reviewed By: dblaikie
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D30220
llvm-svn: 295764
Summary:
The DLL thunks are stubs added to an instrumented DLL to redirect ASAN API calls
to the real ones in the main executable. These thunks must contain dummy
code before __asan_init got called. Unfortunately, MSVC linker is doing ICF and is
merging functions with the same body.
In our case, this two ASAN thunks were incorrectly merged:
```
asan_interface.inc:16
INTERFACE_FUNCTION(__asan_before_dynamic_init)
```
```
sanitizer_common_interface.inc:16
INTERFACE_FUNCTION(__sanitizer_verify_contiguous_container)
```
The same thunk got patched twice. After the second patching, calls to
`__asan_before_dynamic_init` are redirected to `__sanitizer_verify_contiguous_container`
and trigger a DCHECK on incorrect operands/
The problem was caused by the macro that is only using __LINE__ to prevent
collapsing code.
```
#define INTERCEPT_SANITIZER_FUNCTION(name)
extern "C" __declspec(noinline) void name() {
volatile int prevent_icf = (__LINE__ << 8); (void)prevent_icf;
```
The current patch is adding __COUNTER__ which is safer than __LINE__.
Also, to precent ICF (guarantee that code is different), we are using a unique attribute:
- the name of the function
Reviewers: rnk
Reviewed By: rnk
Subscribers: llvm-commits, kubamracek, chrisha, dberris
Differential Revision: https://reviews.llvm.org/D30219
llvm-svn: 295761
This is part of trying to clean up our handling of min/max patterns in IR.
By converting these to canonical form, we're more likely to recognize them
because there are various places in InstCombine that don't use
matchSelectPattern or m_SMax and friends.
The backend fixups referenced in the now deleted TODO comment were added with:
https://reviews.llvm.org/rL291392https://reviews.llvm.org/rL289738
If there's any codegen fallout from this change, we should be able to address
it in DAGCombiner or target-specific lowering.
llvm-svn: 295758
There are still over 3400 files remaining with this property set, but there are tens of thousands more with the property not set. Until we decide what to do on a global scale, this at least unblocks me temporarily.
llvm-svn: 295756
Before frame offsets are calculated, try to eliminate the
frame indexes used by SGPR spills. Then we can delete them
after.
I think for now we can be sure that no other instruction
will be re-using the same frame indexes. It should be easy
to notice if this assumption ever breaks since everything
asserts if it tries to use a dead frame index later.
The unused emergency stack slot seems to still be left behind,
so an additional 4 bytes is still wasted.
llvm-svn: 295753
Conflicting debug info for function arguments causes hard-to-debug
assertions in the DWARF backend, so the Verifier should reject it.
For performance reasons this only checks function arguments from
non-inlined debug intrinsics for now.
rdar://problem/30520286
llvm-svn: 295749
Summary:
Rework the code that was sinking/duplicating (icmp and, 0) sequences
into blocks where they were being used by conditional branches to form
more tbz instructions on AArch64. The new code is more general in that
it just looks for 'and's that have all icmp 0's as users, with a target
hook used to select which subset of 'and' instructions to consider.
This change also enables 'and' sinking for X86, where it is more widely
beneficial than on AArch64.
The 'and' sinking/duplicating code is moved into the optimizeInst phase
of CodeGenPrepare, where it can take advantage of the fact the
OptimizeCmpExpression has already sunk/duplicated any icmps into the
blocks where they are used. One minor complication from this change is
that optimizeLoadExt needed to be updated to always mark 'and's it has
determined should be in the same block as their feeding load in the
InsertedInsts set to avoid an infinite loop of hoisting and sinking the
same 'and'.
This change fixes a regression on X86 in the tsan runtime caused by
moving GVNHoist to a later place in the optimization pipeline (see
PR31382).
Reviewers: t.p.northover, qcolombet, MatzeB
Subscribers: aemerson, mcrosier, sebpop, llvm-commits
Differential Revision: https://reviews.llvm.org/D28813
llvm-svn: 295746
To avoid depending on kernel headers, we just repeat the single define
we need, which is likely never going to change.
Patch by Joakim Sindholt <opensource@zhasha.com>
llvm-svn: 295738