While this change didn't really hurt, it does lead to spurious
warnings about not being able to override weak symbols if you end up
linking objects built with this change to ones built without it.
Furthermore, since __call_once_proxy is called indirectly anyway it
doesn't actually inline ever.
Longer term, it would probably make sense to give this symbol internal
visibility instead.
This reverts r291497
llvm-svn: 293581
For now just port some of the existing NVPTX tests
and from an old HSAIL optimization pass which
approximately did the same thing.
Don't enable the pass yet until more testing is done.
llvm-svn: 293580
Use the qualified name for StringLiteral (llvm::StringLiteral) when
generating the sources. This is needed as the generated files may be
used out-of-tree (e.g. swift) where you may not have a
`using namespace llvm;` resulting in an undefined lookup.
llvm-svn: 293577
Make SolveLinEquationWithOverflow take the start as a SCEV, so we can
solve more cases. With that implemented, get rid of the special case
for powers of two.
The additional functionality probably isn't particularly useful,
but it might help a little for certain cases involving pointer
arithmetic.
Differential Revision: https://reviews.llvm.org/D28884
llvm-svn: 293576
This re-applies r293343 (reverts commit r293475) with a fix for an
assertion failure caused by a missing integer cast. I tested this patch
by using the built compiler to compile X86FastISel.cpp.o with ubsan.
Original commit message:
Ubsan does not report UB shifts in some cases where the shift exponent
needs to be truncated to match the type of the shift base. We perform a
range check on the truncated shift amount, leading to false negatives.
Fix the issue (PR27271) by performing the range check on the original
shift amount.
Differential Revision: https://reviews.llvm.org/D29234
llvm-svn: 293572
Summary:
In revision rL278321, ExecutionDepsFix learned how to pick a better
register for undef register reads, e.g. for instructions such as
`vcvtsi2sdq`. While this revision improved performance on a good number
of our benchmarks, it unfortunately also caused significant regressions
(up to 3x) on others. This regression turned out to be caused by loops
such as:
PH -> A -> B (xmm<Undef> -> xmm<Def>) -> C -> D -> EXIT
^ |
+----------------------------------+
In the previous version of the clearance calculation, we would visit
the blocks in order, remembering for each whether there were any
incoming backedges from blocks that we hadn't processed yet and if
so queuing up the block to be re-processed. However, for loop structures
such as the above, this is clearly insufficient, since the block B
does not have any unknown backedges, so we do not see the false
dependency from the previous interation's Def of xmm registers in B.
To fix this, we need to consider all blocks that are part of the loop
and reprocess them one the correct clearance values are known. As
an optimization, we also want to avoid reprocessing any later blocks
that are not part of the loop.
In summary, the iteration order is as follows:
Before: PH A B C D A'
Corrected (Naive): PH A B C D A' B' C' D'
Corrected (w/ optimization): PH A B C A' B' C' D
To facilitate this optimization we introduce two new counters for each
basic block. The first counts how many of it's predecssors have
completed primary processing. The second counts how many of its
predecessors have completed all processing (we will call such a block
*done*. Now, the criteria to reprocess a block is as follows:
- All Predecessors have completed primary processing
- For x the number of predecessors that have completed primary
processing *at the time of primary processing of this block*,
the number of predecessors that are done has reached x.
The intuition behind this criterion is as follows:
We need to perform primary processing on all predecessors in order to
find out any direct defs in those predecessors. When predecessors are
done, we also know that we have information about indirect defs (e.g.
in block B though that were inherited through B->C->A->B). However,
we can't wait for all predecessors to be done, since that would
cause cyclic dependencies. However, it is guaranteed that all those
predecessors that are prior to us in reverse postorder will be done
before us. Since we iterate of the basic blocks in reverse postorder,
the number x above, is precisely the count of the number of predecessors
prior to us in reverse postorder.
Reviewers: myatsina
Differential Revision: https://reviews.llvm.org/D28759
llvm-svn: 293571
Create a WasmDumper subclass of ObjDumper to support Webassembly binary
files.
Patch by Sam Clegg
Differential Revision: https://reviews.llvm.org/D27355
llvm-svn: 293569
This fixes various ways to tickle an assertion in constant expression
evaluation when using __int128. Longer term, we need to figure out what should
happen here: either any kind of overflow in offset calculation should result in
a non-constant value or we should truncate to 64 bits. In C++11 onwards, we're
effectively already checking for overflow because we strictly enforce array
bounds checks, but even there some forms of overflow can slip past undetected.
llvm-svn: 293568
- Move DEBUG_TYPE below includes
- Change unknown address space constant to be consistent with other
passes
- Grammar fixes in debug output
llvm-svn: 293567
combineX86ShufflesRecursively can still only handle a maximum of 2 shuffle inputs but everything before it now supports any number of shuffle inputs.
This will be necessary for combining OR(SHUFFLE, SHUFFLE) patterns.
llvm-svn: 293560
Summary: SamplePGO needs to check if it is legal to promote a target before it actually promotes it.
Reviewers: davidxl
Reviewed By: davidxl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D29306
llvm-svn: 293559
Summary:
For a reason that hasn't been investigated for lack of powerpc knowledge and
hardware, -fno-function-sections is required for the Sanitizers to work
properly on powerpc64le. Without, the function-sections-are-bad test fails on
that architecture (and that architecture only).
This patch re-enables the flag in the powerpc64le cflags.
I have to admit I am not entirely sure if my way is the proper way to do this,
so if anyone has a better way, I'll be happy to oblige.
Reviewers: kcc, eugenis
Reviewed By: eugenis
Subscribers: nemanjai, mgorny, llvm-commits
Differential Revision: https://reviews.llvm.org/D29285
llvm-svn: 293558
Just a small clean up noticed when doing post-commit review of Duncan's
previous change for ModuleFile memory ownership semantics. NFC.
llvm-svn: 293556
Since we have no call support and late linking we can produce code
only for used symbols. This saves compilation time, size of the final
executable, and size of any intermediate dumps.
Run Internalize pass early in the opt pipeline followed by global
DCE pass. To enable it RT can pass -amdgpu-internalize-symbols option.
Differential Revision: https://reviews.llvm.org/D29214
llvm-svn: 293549
Summary:
Consider formatting the following code fragment with column limit 20:
```
{
// line 1
// line 2\
// long long long line
}
```
Before this fix the output is:
```
{
// line 1
// line 2\
// long long
long line
}
```
This patch fixes a regression that breaks the last comment line without
adding the '//' prefix.
Reviewers: djasper
Reviewed By: djasper
Subscribers: cfe-commits, klimek
Differential Revision: https://reviews.llvm.org/D29298
llvm-svn: 293548
To better match the old darwin otool(1) behavior, when llvm-obdump(1) is used
with the -macho option and the input file is not an object file simply print
the file name and this message:
foo: is not an object file
and continue on to process other input files. Also in this case don’t exit
non-zero. This should help in some OSS projects' with autoconf scripts
that are expecting the old darwin otool(1) behavior.
rdar://26828015
llvm-svn: 293547
For some reason the exception selector register must be a pointer (that's
assumed by SDag); on the other hand, it gets moved into an IR-level type which
might be entirely different (i32 on AArch64). IRTranslator needs to be aware of
this.
llvm-svn: 293546
For targets with different addressing modes in each address space,
if this is dropped querying isLegalAddressingMode later with this
will give a nonsense result, breaking the isLegalUse assertions.
This is a candidate for the 4.0 release branch.
llvm-svn: 293542
This is worse if the original constant is an inline immediate.
This should also be done for 64-bit adds, but requires fixing
operand folding bugs first.
llvm-svn: 293540
Summary:
The following two comment lines form a single comment section:
```
if (1) { // line 1
// line 2
}
```
This is because the break of a comment section was based on the original column
of the first token of the previous line (in this case, the 'if').
This patch splits these two comment lines into different sections by taking into
account the original column of the right brace preceding the first line comment
where applicable.
Reviewers: djasper
Reviewed By: djasper
Subscribers: cfe-commits, klimek
Differential Revision: https://reviews.llvm.org/D29291
llvm-svn: 293539
macOS
Summary:
In https://bugs.freebsd.org/215125 I was notified that some configure
scripts attempt to test for the Linux-specific `mallinfo` and `mallopt`
functions by compiling and linking small programs which references the
functions, and observing whether that results in errors.
FreeBSD and macOS do not have the `mallinfo` and `mallopt` functions, so
normally these tests would fail, but when sanitizers are enabled, they
incorrectly succeed, because the sanitizers define interceptors for
these functions. This also applies to some other malloc-related
functions, such as `memalign`, `pvalloc` and `cfree`.
Fix this by not intercepting `mallinfo`, `mallopt`, `memalign`,
`pvalloc` and `cfree` for FreeBSD and macOS, in all sanitizers.
Also delete the non-functional `cfree` wrapper for Windows, to fix the
test cases on that platform.
Reviewers: emaste, kcc, rnk
Subscribers: timurrrr, eugenis, hans, joerg, llvm-commits, kubamracek
Differential Revision: https://reviews.llvm.org/D27654
llvm-svn: 293536
Tablegen emitted a warning when the fast isel emitter created dead
code by emitting a pattern that has no predicate before a pattern
that has one.
This should be an error but was originally only a warning because the X86
backend had a buggy definition that unintentionally caused this to be hit
(PR21575). That has been fixed a while ago (r222094), so it's safe to
upgrade the warning to an error.
llvm-svn: 293534