As noted in PR34517, the handling of signed div/rem is not on par with
unsigned div/rem. Signed is harder to reason about, but it should be
possible to handle at least some of these using the same technique that
we use for unsigned: use icmp logic to see if there's a relationship
between the quotient and divisor.
llvm-svn: 312938
Summary: To parser "include" we may need to do binary name substitution.
Reviewers: eugenis, alekseyshl
Subscribers: llvm-commits, kubamracek
Differential Revision: https://reviews.llvm.org/D37658
llvm-svn: 312933
Summary:
GEP merging can sometimes increase the number of live values and register
pressure across control edges and cause performance problems particularly if the
increased register pressure results in spills.
This change implements GEP unmerging around an IndirectBr in certain cases to
mitigate the issue. This is in the CodeGenPrepare pass (after all the GEP
merging has happened.)
With this patch, the Python interpreter loop runs faster by ~5%.
Reviewers: sanjoy, hfinkel
Reviewed By: hfinkel
Subscribers: eastig, junbuml, llvm-commits
Differential Revision: https://reviews.llvm.org/D36772
llvm-svn: 312930
The remaining parts produced by the full partial tile isolation can contain
hot spots that are worth to be optimized. Currently, we rely on the simple
loop unrolling pass, LiCM and the SLP vectorizer to optimize such parts.
However, the approach can suffer from the lack of the information about
aliasing that Polly provides using additional alias metadata or/and the lack
of the information required by simple loop unrolling pass.
This patch is the first step to optimize the remaining parts. To do it, we
unroll and separate them. In case of, for instance, Intel Kaby Lake, it helps
to increase the performance of the generated code from 39.87 GFlop/s to
49.23 GFlop/s.
The next possible step is to avoid unrolling performed by Polly in case of
isolated and remaining parts and rely only on simple loop unrolling pass and
the Loop vectorizer.
Reviewed-by: Tobias Grosser <tobias@grosser.es>
Differential Revision: https://reviews.llvm.org/D37692
llvm-svn: 312929
These two instructions are normally selected, but when the
two address pass converts mac into mad we end up with the
mad where we could have one of these.
Differential Revision: https://reviews.llvm.org/D37389
llvm-svn: 312928
When building COFF programs many targets such as mingw prefer
to have a gnu ld frontend. Rather then having a fully fledged
standalone driver we wrap a shim around the LINK driver.
Extra tests were provided by mstorsjo
Reviewers: mstorsjo, ruiu
Differential Revision: https://reviews.llvm.org/D33880
llvm-svn: 312926
Summary:
r275950 added support for turning (trunc (X >> N) to i1) into BT(X, N). But that's no longer necessary now that i1 isn't legal.
This patch removes the support for that, but preserves some of the refactorings done in that commit.
Reviewers: guyblank, RKSimon, spatel, zvi
Reviewed By: RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D37673
llvm-svn: 312925
'@' is a valid character in file paths, but the linker script tokenizer treats it
as a separate token. This was leading to an unexpected test failure, on our local
builds. This patch changes the test to quote the path to prevent this happening.
An alternative would have been to add '@' to the list of "unquoted tokens" in
ScriptLexer.cpp, but ld.bfd has the same behaviour as the current LLD.
Reviewers: ruiu
Differential Revision: https://reviews.llvm.org/D37689
llvm-svn: 312922
forgetLoop() has pretty bad performance because it goes over
the same instructions over and over again in particular when
nested loop are involved.
The refactoring changes the function to a not-recursive function
and reusing the allocation for data-structures and the Visited
set.
NFCI
Differential Revision: https://reviews.llvm.org/D37659
llvm-svn: 312920
Summary:
While `goog.setTestOnly` usually appears in the imports section of a file, it is
not actually an import, and also usually doesn't take long parameters (nor
namespaces as a parameter, it's a description/message that should be wrapped).
This fixes a regression where a `goog.setTestOnly` call nested in a function was
not wrapped.
Reviewers: djasper
Subscribers: klimek
Differential Revision: https://reviews.llvm.org/D37685
llvm-svn: 312918
When using a virtual file-system (VFS) and a preamble file (PCH) is generated,
it is generated on-disk in the real file-system instead of in the VFS (which
makes sense, since the VFS is read-only). However, when subsequently reading
the generated PCH, the frontend passes through the VFS it has been given --
resulting in an error and a failed parse (since the VFS doesn't contain the
PCH; the real filesystem does).
This patch fixes that by detecting when a VFS is being used for a parse that
needs to work with a PCH file, and creating an overlay VFS that includes the
PCH file from the real file-system.
This allows tests to be written which make use of both PCH files and a VFS.
Differential Revision: https://reviews.llvm.org/D37474
llvm-svn: 312917
Helps improve combineLogicBlendIntoPBLENDV support by allowing us to peek into through PACKSS truncations of vector comparison results.
Differential Revision: https://reviews.llvm.org/D37680
llvm-svn: 312916
A mrt exp with vm=1 must be in exact (non-WQM) mode, as it also exports
the exec mask as the valid mask to determine which pixels to render.
This commit marks any exp as needing to be in exact mode.
Actually, if there are multiple mrt exps, only one needs to have vm=1,
and only that one needs to be in exact mode. But that is an optimization
for another day.
Differential Revision: https://reviews.llvm.org/D36305
llvm-svn: 312915
Summary:
Since asan is linked dynamically on Darwin, the weak interface symbol
is removed by -Wl,-dead_strip.
Reviewers: kcc, compnerd, aaron.ballman
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D37636
llvm-svn: 312914
I'm trying to refactor some shared code for integer div/rem,
but I keep having to scroll through fdiv. The FP ops have
nothing in common with the integer ops, so I'm moving FP
below everything else.
While here, improve a couple of comments and fix some formatting.
llvm-svn: 312913
This check is relatively simple, and is often being used
as an example. I'm aware of at least two cases, when
simply copying the FunctionASTVisitor class to a new
check resulted in a rather unobvious segfault. Having it
in anonymous namespace prevents such a problem.
No functionality change, so i opted to avoid phabricator,
especially since clang-tidy reviews are seriously jammed.
llvm-svn: 312912
Summary:
This causes template arguments to be traversed before the templated
declaration, which is useful for clients that expect the nodes in
the same order as they are in the source code. Additionally, there
seems to be no good reason not to do so.
This was moved here from LexicallyOrderedRecursiveASTVisitor. The tests
still reside in LexicallyOrderedRecursiveASTVisitorTest.cpp under
VisitTemplateDecls.
Reviewers: arphaman, rsmith, klimek
Subscribers: cfe-commits, klimek
Differential Revision: https://reviews.llvm.org/D37662
llvm-svn: 312911
Suggested in D37680
Note: had to drop AVX512VL tests as there is an infinite loop in the new tests that needs further investigation (not relevant to D37680).
llvm-svn: 312910
NFC.
Updated 3 Codegen regression tests to use the -mattr flag instead of the -mcpu flags as follows:
Instead of -mcpu=skx use -mattr=+avx512f,+avx512bw,+avx512vl,+avx512dq
Instead of -mcpu=knl use -mattr=+avx512f
Reviewers: delena
Revision: https://reviews.llvm.org/D37674
llvm-svn: 312909
Also enables '__do_clear_bss'.
These functions are automaticalled called by the CRT if they are
declared.
We need these to be called otherwise RAM will start completely
uninitialised, even though we need to copy RAM variables from progmem to
RAM.
llvm-svn: 312905
Summary:
**Short overview:**
Fixed bug: https://bugs.llvm.org/show_bug.cgi?id=34001
Clang-format bug resulting in a strange behavior of control statements short blocks. Different flags combinations do not guarantee expected result. Turned on option AllowShortBlocksOnASingleLine does not work as intended.
**Description of the problem:**
Cpp source file UnwrappedLineFormatter does not handle AllowShortBlocksOnASingleLine flag as it should. Putting a single-line control statement without any braces, clang-format works as expected (depending on AllowShortIfStatementOnASingleLine or AllowShortLoopsOnASingleLine value). Putting a single-line control statement in braces, we can observe strange and incorrect behavior.
Our short block is intercepted by tryFitMultipleLinesInOne function. The function returns a number of lines to be merged. Unfortunately, our control statement block is not covered properly. There are several if-return statements, but none of them handles our block. A block is identified by the line first token and by left and right braces. A function block works as expected, there is such an if-return statement doing proper job. A control statement block, from the other hand, falls into strange conditional construct, which depends on BraceWrapping.AfterFunction flag (with condition that the line’s last token is left brace, what is possible in our case) or goes even further. That should definitely not happen.
**Description of the patch:**
By adding three different if statements, we guarantee that our short control statement block, however it looks like (different brace wrapping flags may be turned on), is handled properly and does not fall into wrong conditional construct. Depending on appropriate options we return either 0 (when something disturbs our merging attempt) or let another function (tryMergeSimpleBlock) take the responsibility of returned result (number of merged lines). Nevertheless, one more correction is required in mentioned tryMergeSimpleBlock function. The function, previously, returned either 0 or 2. The problem was that this did not handle the case when our block had the left brace in a separate line, not the header one. After change, after adding condition, we return the result compatible with block’s structure. In case of left brace in the header’s line we do everything as before the patch. In case of left brace in a separate line we do the job similar to the one we do in case of a “non-header left brace” function short block. To be precise, we try to merge the block ignoring the header line. Then, if success, we increment our returned result.
**After fix:**
**CONFIG:**
```
AllowShortBlocksOnASingleLine: true
AllowShortIfStatementsOnASingleLine: true
BreakBeforeBraces: Custom
BraceWrapping: {
AfterClass: true, AfterControlStatement: true, AfterEnum: true, AfterFunction: true, AfterNamespace: false, AfterStruct: true, AfterUnion: true, BeforeCatch: true, BeforeElse: true
}
```
**BEFORE:**
```
if (statement) doSomething();
if (statement) { doSomething(); }
if (statement) {
doSomething();
}
if (statement)
{
doSomething();
}
if (statement)
doSomething();
if (statement) {
doSomething1();
doSomething2();
}
```
**AFTER:**
```
if (statement) doSomething();
if (statement) { doSomething(); }
if (statement) { doSomething(); }
if (statement) { doSomething(); }
if (statement) doSomething();
if (statement)
{
doSomething1();
doSomething2();
}
```
Contributed by @PriMee!
Reviewers: krasimir, djasper
Reviewed By: krasimir
Subscribers: cfe-commits, klimek
Differential Revision: https://reviews.llvm.org/D37140
llvm-svn: 312904
This patch will introduce even more aliases for the hicpp-module to already existing
checks and is a follow up for D30383 finishing the other sections.
It fixes a forgotten highlight in hicpp-braces-around-statements.rst, too.
llvm-svn: 312901
This is a preparatory step for D34515 and also is being recommitted as its
first version caused PR34045.
This change:
- makes nodes ISD::ADDCARRY and ISD::SUBCARRY legal for i32
- lowering is done by first converting the boolean value into the carry flag
using (_, C) ← (ARMISD::ADDC R, -1) and converted back to an integer value
using (R, _) ← (ARMISD::ADDE 0, 0, C). An ARMISD::ADDE between the two
operations does the actual addition.
- for subtraction, given that ISD::SUBCARRY second result is actually a
borrow, we need to invert the value of the second operand and result before
and after using ARMISD::SUBE. We need to invert the carry result of
ARMISD::SUBE to preserve the semantics.
- given that the generic combiner may lower ISD::ADDCARRY and
ISD::SUBCARRYinto ISD::UADDO and ISD::USUBO we need to update their lowering
as well otherwise i64 operations now would require branches. This implies
updating the corresponding test for unsigned.
- add new combiner to remove the redundant conversions from/to carry flags
to/from boolean values (ARMISD::ADDC (ARMISD::ADDE 0, 0, C), -1) → C
- fixes PR34045
Differential Revision: https://reviews.llvm.org/D35192
llvm-svn: 312898
After the split of the Scatter operation, the order of the new instructions is well defined - Lo goes before Hi. Otherwise the semantic of Scatter (from LSB to MSB) is broken.
I'm chaining 2 nodes to prevent reordering.
Differential Revision https://reviews.llvm.org/D37670
llvm-svn: 312894
This patch fixes llvm.org/PR34298. Previously libc++ incorrectly evaluated
the __invokable trait via the converting constructor `function(Tp)` [with Tp = std::function]
whenever the copy constructor or copy assignment operator
was required. This patch further constrains that constructor to short
circut before evaluating the troublesome SFINAE when `Tp` matches
std::function.
The original patch is from Alex Lorenz.
llvm-svn: 312892
This reverts commit r312890 because the test case fails to compile for
older versions of Clang that reject initializing a const object without
a user defined constructor.
Since this patch should go into 5.0.1, I want to keep it an atomic change,
and will re-commit it with a fixed test case.
llvm-svn: 312891
This patch fixes llvm.org/PR34298. Previously libc++ incorrectly evaluated
the __invokable trait via the converting constructor `function(Tp)` [with Tp = std::function]
whenever the copy constructor or copy assignment operator
was required. This patch further constrains that constructor to short
circut before evaluating the troublesome SFINAE when `Tp` matches
std::function.
The original patch is from Alex Lorenz.
llvm-svn: 312890
This removes some duplicated code and makes it easier to support signed div/rem
in a similar way if we want to do that. Note that the existing comments were not
accurate - we don't need a constant divisor to simplify; icmp simplification does
more than that. But as the added tests show, it could go even further.
llvm-svn: 312885
First step towards making it possible to use the shuffle combines for cases where we don't want to call DCI.CombineTo() with the result.
llvm-svn: 312884