This patch continues a series of patches started by r332907 (reapplied
as r332917).
In this commit we sort rules within their 2nd level by the type check
on def operand of the root instruction, which allows for better
nesting grouping on the level.
This is expected to decrease time GlobalISel spends in its
InstructionSelect pass by roughly 22% for an -O0 build as measured on
sqlite3-amalgamation (http://sqlite.org/download.html) targeting
AArch64 (cross-compile on x86).
Reviewers: qcolombet, dsanders, bogner, aemerson, javed.absar
Reviewed By: qcolombet
Subscribers: rovka, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D44700
llvm-svn: 333131
unusual types.
Following the observed behavior of GCC, we now return -1 for vector types
(along with all of our extensions that GCC doesn't support), and for atomic
types we classify the underlying type.
GCC appears to have changed its classification for function and array arguments
between version 5 and version 6. Previously it would classify them as pointers
in C and as functions or arrays in C++, but from version 6 onwards, it
classifies them as pointers. We now follow the more recent GCC behavior rather
than emulating what I can only assume to be a historical bug in their C++
support for this builtin.
Finally, no version of GCC that I can find has ever used the "method"
classification for C++ pointers to member functions. Instead, GCC classifies
them as record types, presumably reflecting an internal implementation detail,
but whatever the reason we now produce compatible results.
llvm-svn: 333126
by replacing DenseMap with IndexedMap for LLTs within MRI, as
benchmarked by cross-compiling sqlite3 amalgamation for AArch64
on x86 machine.
Reviewed By: qcolombet
Differential Revision: https://reviews.llvm.org/D46809
llvm-svn: 333125
Summary:
Since clang r332929 these two headers throw errors when included from somewhere else than their wrapper header. It seems marking them as textual is the best way to fix the builds.
Fixes this new module build error:
While building module '_Builtin_intrinsics' imported from ...:
In file included from <module-includes>:2:
In file included from lib/clang/7.0.0/include/immintrin.h:54:
In file included from lib/clang/7.0.0/include/wmmintrin.h:29:
lib/clang/7.0.0/include/__wmmintrin_aes.h:25:2: error: "Never use <__wmmintrin_aes.h> directly; include <wmmintrin.h> instead."
#error "Never use <__wmmintrin_aes.h> directly; include <wmmintrin.h> instead."
Reviewers: rsmith, v.g.vassilev, craig.topper
Reviewed By: craig.topper
Subscribers: craig.topper, cfe-commits
Differential Revision: https://reviews.llvm.org/D47277
llvm-svn: 333123
This is a small follow-up to the revisions r333117 and r331663.
1. Avoid the name conflicts of the generated variables for prefixes.
2. Apply clang-format -i -style=llvm to llvm-objcopy.cpp once again.
3. Add a test for the flag with double dash.
Test plan: make check-all
llvm-svn: 333120
Besides normal updates this change also contains a bug-fix to in
isl_coalesce which broke the AOSP buildbot. Thanks to Michael Kruse for
reporting this bug and Sven Verdoolage for fixing this bug.
llvm-svn: 333118
Summary:
The most common usecase for -runs=0 is for generating code coverage
over some corpus. Coverage reports based on sancov are about to be deprecated,
which means some external coverage solution will be used, e.g. Clang source
based code coverage, which does not use any sancov instrumentations and thus
libFuzzer would consider any input to be not interesting in that case.
Reviewers: kcc
Reviewed By: kcc
Subscribers: alex, delcypher, #sanitizers, llvm-commits
Differential Revision: https://reviews.llvm.org/D47271
llvm-svn: 333116
Implemente patterns to extract [Un]signed Word vector element and convert to
quad-precision.
Differential Revision: https://reviews.llvm.org/D46536
llvm-svn: 333115
This patch continues a series of patches started by r332907 (reapplied
as r332917)
In this commit we sort type checks towards the beginning of every rule
within the MatchTable as they fail often and it's best to fail early.
This is expected to decrease time GlobalISel spends in its
InstructionSelect pass by roughly 7% for an -O0 build as measured on
sqlite3-amalgamation (http://sqlite.org/download.html) targeting
AArch64. The amalgamation is a large single-file C-source that makes
compiler backend performance improvements to stand out from frontend.
It's also a part of CTMark.
Reviewers: qcolombet, dsanders, bogner, aemerson, javed.absar
Reviewed By: qcolombet
Subscribers: rovka, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D44700
llvm-svn: 333114
Implemente patterns to extract [Un]signed DWord vector element and convert to
quad-precision.
Differential Revision: https://reviews.llvm.org/D46333
llvm-svn: 333112
Summary:
StructurizeCFG::orderNodes basically uses a reverse post-order (RPO) traversal of the region list to get the order.
The only problem with it is that sometimes backedges for outer loops will be visited before backedges for inner loops.
To solve this problem, a loop depth based approach has been used to make sure all blocks in this loop has been visited
before moving on to outer loop.
However, we found a problem for a SubRegion which is a loop itself:
--> BB1 --> BB2 --> BB3 -->
In this case, BB2 is a SubRegion (loop), and thus its loopdepth is different than that of BB1 and BB3. This fact will lead
BB2 to be placed in the wrong order.
In this work, we treat the SubRegion as a special case and use its exit block to determine the loop and its depth
to guard the sorting.
Reviewers:
arsenm, jlebar
Differential Revision:
https://reviews.llvm.org/D46912
llvm-svn: 333111
This matches the Intel documentation which shows them available by importing immintrin.h. x86intrin.h also includes immintrin.h so anyone including x86intrin.h will still get them.
This is different than gcc, but I don't think we were a perfect match there already. I'm unclear what gcc's policy is about how they choose which to add things to.
Differential Revision: https://reviews.llvm.org/D47182
llvm-svn: 333110
Summary:
`sanitizer_internal_defs.h` didn't have this define, which will be useful in
an upcoming CL.
Reviewers: alekseyshl
Reviewed By: alekseyshl
Subscribers: kubamracek, delcypher, llvm-commits, #sanitizers
Differential Revision: https://reviews.llvm.org/D47270
llvm-svn: 333109
Summary:
Finally fixes [[ https://bugs.llvm.org/show_bug.cgi?id=6773 | PR6773 ]].
Now that the backend is all done, we can finally fold it!
The canonical unfolded masked merge pattern is
```(x & m) | (y & ~m)```
There is a second, equivalent variant:
```(x | ~m) & (y | m)```
Only one of them (the or-of-and's i think) is canonical.
And if the mask is not a constant, we should fold it to:
```((x ^ y) & M) ^ y```
https://rise4fun.com/Alive/ndQw
Reviewers: spatel, craig.topper
Reviewed By: spatel
Subscribers: nicholas, RKSimon, llvm-commits
Differential Revision: https://reviews.llvm.org/D46814
llvm-svn: 333106
The default statement granularity changed in a recent change by Micheal. To
avoid forwad-porting the testcases, enable the legacy behaviour again in these tests.
llvm-svn: 333105
Summary: This patch adds a PDT constructor from Function and lets codes previously using a local class to do this use PostDominatorTree class directly.
Reviewers: davide, kuhar, grosser, dberlin
Reviewed By: kuhar
Author: NutshellySima
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D46709
llvm-svn: 333102
r333093 introduced several warnings (-Wlogical-not-parentheses,
-Wbool-compare).
Adding parentheses in MipsSEInstrInfo::isCopyInstr() to silence it.
llvm-svn: 333097
This patch implements the "block reciprocal throughput" computation in the
SummaryView.
The block reciprocal throughput is computed as the MAX of:
- NumMicroOps / DispatchWidth
- Resource Cycles / #Units (for every resource consumed).
The block throughput is bounded from above by the hardware dispatch throughput.
That is because the DispatchWidth is an upper bound on how many opcodes can be part
of a single dispatch group.
The block throughput is also limited by the amount of hardware parallelism. The
number of available resource units affects how the resource pressure is
distributed, and also how many blocks can be delivered every cycle.
llvm-svn: 333095
This property is needed in order to follow values movement between
registers. This property is used in TII to implement method that
returns true if simple copy like instruction is recognized, along
with source and destination machine operands.
Patch by Nikola Prica.
Differential Revision: https://reviews.llvm.org/D45204
llvm-svn: 333093
Now that the LLVM_DEBUG() macro landed on the various sub-projects
the DEBUG macro can be removed.
Also change the new uses of DEBUG to LLVM_DEBUG.
Differential Revision: https://reviews.llvm.org/D46952
llvm-svn: 333091
statement naming
- A recent ppcg/isl update caused the grid/block size upper bounds to
deviate by one from the oracle. This is not an effect that's visible at
runtime.
- Statement naming changed in polly. Update the testcases.
llvm-svn: 333090
Summary:
Currently we do not import the implicit CXXRecordDecl of a
ClassTemplateSpecializationDecl. This patch fixes it.
Reviewers: a.sidorin, xazax.hun, r.stahl
Subscribers: rnkovacs, dkrupp, cfe-commits
Differential Revision: https://reviews.llvm.org/D47057
llvm-svn: 333086
Summary:
This patch fixes two bugs in clang-format where the template wrapper doesn't skip over
comments causing a long template declaration to not be split into multiple lines.
These were latent and exposed by r332436.
Reviewers: sammccall
Reviewed By: sammccall
Subscribers: klimek, cfe-commits
Differential Revision: https://reviews.llvm.org/D47257
llvm-svn: 333085
Summary:
We fail to import a `ClassTemplateDecl` if the "To" context already
contains a definition and then a forward decl. This is because
`localUncachedLookup` does not find the definition. This is not a
lookup error, the parser behaves differently than assumed in the
importer code. A `DeclContext` contains one DenseMap (`LookupPtr`)
which maps names to lists. The list is a special list `StoredDeclsList`
which is optimized to have one element. During building the initial
AST, the parser first adds the definition to the `DeclContext`. Then
during parsing the second declaration (the forward decl) the parser
again calls `DeclContext::addDecl` but that will not add a new element
to the `StoredDeclsList` rarther it simply overwrites the old element
with the most recent one. This patch fixes the error by finding the
definition in the redecl chain. Added tests for the same issue with
`CXXRecordDecl` and with `ClassTemplateSpecializationDecl`. These tests
pass and they pass because in `VisitRecordDecl` and in
`VisitClassTemplateSpecializationDecl` we already use
`D->getDefinition()` after the lookup.
Reviewers: a.sidorin, xazax.hun, szepet
Subscribers: rnkovacs, dkrupp, cfe-commits
Differential Revision: https://reviews.llvm.org/D46950
llvm-svn: 333082