This is based on the example/discussion in PR37428:
https://bugs.llvm.org/show_bug.cgi?id=37428
Proper vector shift instructions don't appear until AVX2, so we may generate several
extra instructions within a loop trying to compensate for that. It's difficult to
recover from that shift expansion later than this, so use the existing TLI hook and
splat analysis to enable better codegen.
This extends CGP functionality introduced with:
rL201655
Differential Revision: https://reviews.llvm.org/D63233
llvm-svn: 363511
Often times, when an ArraySubscriptExpr was reported as null or
undefined, the bug report was difficult to understand, because the
analyzer explained why arr[i] has that value, but didn't realize that in
fact i's value is very important as well. This patch fixes this by
tracking the indices of arrays.
Differential Revision: https://reviews.llvm.org/D63080
llvm-svn: 363510
This is similar logic/motivation to the select splitting in D62969.
In D63233, the pattern changes so that we no longer have an extract_subvector of vselect,
but the operands of the select are still being concatenated.
The closest case is represented in either the first or last test diffs here - we have an
extra instruction, but we converted 3-4 ymm instructions into 4-5 xmm instructions.
I think that's the right trade-off for most AVX1 targets.
In the example based on PR37428:
https://bugs.llvm.org/show_bug.cgi?id=37428
...this makes the loop about 30% faster (tested on Haswell by compiling with -mavx).
Differential Revision: https://reviews.llvm.org/D63364
llvm-svn: 363508
Summary:
This builds on the relations support added in D59407, D62459, D62471,
and D62839 to implement type hierarchy subtypes.
Reviewers: kadircet
Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, mgrang, arphaman,
jdoerfert, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D58880
llvm-svn: 363506
Was reverted in r363379 due to build breakage.
Thanks to Nico Weber for reverting the original and suggesting the
fix.
Please see https://reviews.llvm.org/D61697
llvm-svn: 363502
Summary:
With Split DWARF the resulting object file (then called skeleton CU)
contains the file name of another ("DWO") file with the debug info.
This can be a problem for remote compilation, as it will contain the
name of the file on the compilation server, not on the client.
To use Split DWARF with remote compilation, one needs to either
* make sure only relative paths are used, and mirror the build directory
structure of the client on the server,
* inject the desired file name on the client directly.
Since llc already supports the latter solution, we're just copying that
over. We allow setting the actual output filename separately from the
value of the DW_AT_[GNU_]dwo_name attribute in the skeleton CU.
Fixes PR40276.
Reviewers: dblaikie, echristo, tejohnson
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D59673
llvm-svn: 363496
Summary:
If the nested loop is an innermost loop, prefer to a 32-byte alignment, so that
we can decrease cache misses and branch-prediction misses. Actual alignment of
the loop will depend on the hotness check and other logic in alignBlocks.
The old code will only align hot loop to 32 bytes when the LoopSize larger than
16 bytes and smaller than 32 bytes, this patch will align the innermost hot loop
to 32 bytes not only for the hot loop whose size is 16~32 bytes.
Reviewed By: steven.zhang, jsji
Differential Revision: https://reviews.llvm.org/D61228
llvm-svn: 363495
Summary:
This is the first in a series of changes trying to align clang -cc1
flags for Split DWARF with those of llc. The unfortunate side effect of
having -split-dwarf-output for single file Split DWARF will disappear
again in a subsequent change.
The change is the result of a discussion in D59673.
Reviewers: dblaikie, echristo
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D63130
llvm-svn: 363494
Summary:
When using ConstantExpr we often need the result of the expression to be kept in the AST. Currently this is done on a by the node that needs the result and has been done multiple times for enumerator, for constexpr variables... . This patch adds to ConstantExpr the ability to store the result of evaluating the expression. no functional changes expected.
Changes:
- Add trailling object to ConstantExpr that can hold an APValue or an uint64_t. the uint64_t is here because most ConstantExpr yield integral values so there is an optimized layout for integral values.
- Add basic* serialization support for the trailing result.
- Move conversion functions from an enum to a fltSemantics from clang::FloatingLiteral to llvm::APFloatBase. this change is to make it usable for serializing APValues.
- Add basic* Import support for the trailing result.
- ConstantExpr created in CheckConvertedConstantExpression now stores the result in the ConstantExpr Node.
- Adapt AST dump to print the result when present.
basic* : None, Indeterminate, Int, Float, FixedPoint, ComplexInt, ComplexFloat,
the result is not yet used anywhere but for -ast-dump.
Reviewers: rsmith, martong, shafik
Reviewed By: rsmith
Subscribers: rnkovacs, hiraditya, dexonsmith, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D62399
llvm-svn: 363493
Summary:
When we traversed backwards on ExplodedNodes to see where processed the
given statement we `break` too early. With the current approach we do not
miss the CallExitEnd ProgramPoint which stands for an inlined call.
Reviewers: NoQ, xazax.hun, ravikandhadai, baloghadamsoftware, Szelethus
Reviewed By: NoQ
Subscribers: szepet, rnkovacs, a.sidorin, mikhail.ramalho, donat.nagy,
dkrupp, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D62926
llvm-svn: 363491
Based on D59959, this switches SCEV to use unsigned/signed range
intersection based on the sign hint. This will prefer non-wrapping
ranges in the relevant domain. I've left the one intersection in
getRangeForAffineAR() to use the smallest intersection heuristic,
as there doesn't seem to be any obvious preference there.
Differential Revision: https://reviews.llvm.org/D60035
llvm-svn: 363490
If we can detect that saturating math that depends on an IV cannot
overflow, replace it with simple math. This is similar to the CVP
optimization from D62703, just based on a different underlying
analysis (SCEV vs LVI) that catches different cases.
Differential Revision: https://reviews.llvm.org/D62792
llvm-svn: 363489
Summary:
Since the addition of __builtin_is_constant_evaluated the result of an expression can change based on whether it is evaluated in constant context. a lot of semantic checking performs evaluations with out specifying context. which can lead to wrong diagnostics.
for example:
```
constexpr int i0 = (long long)__builtin_is_constant_evaluated() * (1ll << 33); //#1
constexpr int i1 = (long long)!__builtin_is_constant_evaluated() * (1ll << 33); //#2
```
before the patch, #2 was diagnosed incorrectly and #1 wasn't diagnosed.
after the patch #1 is diagnosed as it should and #2 isn't.
Changes:
- add a flag to Sema to passe in constant context mode.
- in SemaChecking.cpp calls to Expr::Evaluate* are now done in constant context when they should.
- in SemaChecking.cpp diagnostics for UB are not checked for in constant context because an error will be emitted by the constant evaluator.
- in SemaChecking.cpp diagnostics for construct that cannot appear in constant context are not checked for in constant context.
- in SemaChecking.cpp diagnostics on constant expression are always emitted because constant expression are always evaluated.
- semantic checking for initialization of constexpr variables is now done in constant context.
- adapt test that were depending on warning changes.
- add test.
Reviewers: rsmith
Reviewed By: rsmith
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D62009
llvm-svn: 363488
The default nm executable may not be able to handle the architecture
we're building the sanitizers for. Respect CMAKE_NM if it's set to
ensure we're using the correct nm tool. Preserve the existing NM
environment variable override to not break its users.
Differential Revision: https://reviews.llvm.org/D63368
llvm-svn: 363483
This reverts rL363474. -debug-only=isel was added to some tests that
don't specify `REQUIRES: asserts`. This causes failures on
-DLLVM_ENABLE_ASSERTIONS=off builds.
I chose to revert instead of fixing the tests because I'm not sure
whether we should add `REQUIRES: asserts` to more tests.
llvm-svn: 363482
Summary:
This builds on the relations support added in D59407, D62459,
and D62471 to expose relations in SymbolIndex and its
implementations.
Reviewers: kadircet
Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D62839
llvm-svn: 363481
Add constraints for the test that require specific backend targets
to be registered.
Remove trailing whitespace in the doc.
Differential Revision: https://reviews.llvm.org/D63105
llvm-svn: 363475
Current findBestLoopTop can find and move one kind of block to top, a latch block has one successor. Another common case is:
* a latch block
* it has two successors, one is loop header, another is exit
* it has more than one predecessors
If it is below one of its predecessors P, only P can fall through to it, all other predecessors need a jump to it, and another conditional jump to loop header. If it is moved before loop header, all its predecessors jump to it, then fall through to loop header. So all its predecessors except P can reduce one taken branch.
Differential Revision: https://reviews.llvm.org/D43256
llvm-svn: 363471
with 'objc_arc_inert'
Those calls are no-ops, so they can be safely deleted.
rdar://problem/49839633
Differential Revision: https://reviews.llvm.org/D62433
llvm-svn: 363468
'objc_arc_inert'
The attribute enables the ARC optimizer to delete ObjC ARC runtime calls
on the annotated globals (see https://reviews.llvm.org/D62433). We
currently only annotate global variables for string literals and global
blocks with the attribute.
rdar://problem/49839633
Differential Revision: https://reviews.llvm.org/D62831
llvm-svn: 363467
Currently you get extra waits, because waits are inserted for the
register dependencies of the call, and the function prolog waits on
everything.
Currently waits are still inserted on returns. It may make sense to
not do this, and wait in the caller instead.
llvm-svn: 363465
This patch allows clang users to print out a list of supported CPU models using
clang [--target=<target triple>] --print-supported-cpus
Then, users can select the CPU model to compile to using
clang --target=<triple> -mcpu=<model> a.c
It is a handy feature to help cross compilation.
llvm-svn: 363464