Commit Graph

364730 Commits

Author SHA1 Message Date
Matt Arsenault 201f770f16 GlobalISel: Add and_trivial_mask to all_combines
Also make up a new category of combines.
2020-08-27 16:42:09 -04:00
Krzysztof Parzyszek 4ef9275b9b [Hexagon] Emit better 32-bit multiplication sequence for HVXv62+ 2020-08-27 15:24:32 -05:00
Eli Friedman 8d21985a75 [RegisterScavenging] Delete dead function unprocess(). 2020-08-27 13:19:32 -07:00
Shinji Okumura 58d257b290 [Attributor] Do not add AA to dependency graph after the update stage
If an AA is registered to the dependency graph in the manifest stage, Attributor aborts in `::manifestAttributes()`.
This patch prevents such termination.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D86734
2020-08-28 05:16:18 +09:00
Craig Topper 17ceda99d3 [CodeGen] Use an AttrBuilder to bulk remove 'target-cpu', 'target-features', and 'tune-cpu' before re-adding in CodeGenModule::setNonAliasAttributes.
I think the removeAttributes interface should be faster than
calling removeAttribute 3 times.
2020-08-27 12:54:20 -07:00
Saiyedul Islam ff260ad0e0 [OpenMP] Ensure testing for versions 4.5 and default - Part 3
This third patch in the series removes version 5.0 string from
test cases making them check for default version. It also add test
cases for version 4.5.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D85214
2020-08-27 19:37:04 +00:00
Roman Lebedev b85f91fdce
[InstSimplify] SimplifyPHINode(): check that instruction is in basic block first
As pointed out in post-commit review, this can legally be called
on instructions that are not inserted into basic blocks,
so don't blindly assume that there is basic block.
2020-08-27 22:32:03 +03:00
Christopher Tetreault 035833ae42 [SVE] Remove bad call to VectorType::getNumElements() from HeapProfiler
Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D86727
2020-08-27 12:16:00 -07:00
Yang Fan 6e26e49edf [analyzer] NFC: Fix wrong parameter name in printFormattedEntry.
Parameters were in a different order in the header and in the implementation.

Fix surrounding comments a bit.

Differential Revision: https://reviews.llvm.org/D86691
2020-08-27 12:15:26 -07:00
Yang Fan 37c21dbb3a [analyzer] Fix the debug print about debug egraph dumps requiring asserts.
There's no need to remind people about that when clang *is* built with asserts.

Differential Revision: https://reviews.llvm.org/D86334
2020-08-27 12:15:26 -07:00
Adam Balogh 4448affede [analyzer] pr47037: CastValueChecker: Support for the new variadic isa<>.
llvm::isa<>() and llvm::isa_and_not_null<>() template functions recently became
variadic. Unfortunately this causes crashes in case of isa_and_not_null<>()
and incorrect behavior in isa<>(). This patch fixes this issue.

Differential Revision: https://reviews.llvm.org/D85728
2020-08-27 12:15:25 -07:00
Adam Balogh 5a9e778939 [analyzer] NFC: Store the pointee/referenced type for dynamic type tracking.
The successfulness of a dynamic cast depends only on the C++ class, not the pointer or reference. Thus if *A is a *B, then &A is a &B,
const *A is a const *B etc. This patch changes DynamicCastInfo to store
and check the cast between the unqualified pointed/referenced types.
It also removes e.g. SubstTemplateTypeParmType from both the pointer
and the pointed type.

Differential Revision: https://reviews.llvm.org/D85752
2020-08-27 12:15:23 -07:00
Dokyung Song 52f1df0923 Recommit "[libFuzzer] Fix value-profile-load test."
value-profile-load.test needs adjustment with a mutator change in
bb54bcf849, which reverted as of now, but will be
recommitted after landing this patch.

This patch makes value-profile-load.test more friendly to (and aware of) the
current value profiling strategy, which is based on the hamming as well as the
absolute distance. To this end, this patch adjusts the set of input values that
trigger an expected crash. More specifically, this patch now uses a single value
0x01effffe as a crashing input, because this value is close to values like
{0x1ffffff, 0xffffff, ...}, which are very likely to be added to the corpus per
the current hamming- and absolute-distance-based value profiling strategy. Note
that previously the crashing input values were {1234567 * {1, 2, ...}, s.t. <
INT_MAX}.

Every byte in the chosen value 0x01effeef is intentionally different; this was
to make it harder to find the value without the intermediate inputs added to the
corpus by the value profiling strategy.

Also note that LoadTest.cpp now uses a narrower condition (Size != 8) for
initial pruning of inputs, effectively preventing libFuzzer from generating
inputs longer than necessary and spending time on mutating such long inputs in
the corpus - a functionality not meant to be tested by this specific test.

Differential Revision: https://reviews.llvm.org/D86247
2020-08-27 19:12:30 +00:00
Haojian Wu 3f8a0ecdaa [libcxx] Fix the broken test after D82657.
Differential Revision: https://reviews.llvm.org/D86685
2020-08-27 21:12:11 +02:00
Shinji Okumura c5e6872ec6 [Attributor] Guarantee getAAFor not to update AA in the manifestation stage
If we query an AA with `Attributor::getAAFor` in `AbstractAttribute::manifest`, the AA may be updated.
This patch makes use of the phase flag in Attributor, and handle `getAAFor` behavior according to the flag.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D86635
2020-08-28 04:07:42 +09:00
Vincent Zhao 28a7dfa33d [MLIR] Fixed missing constraint append when adding an AffineIfOp domain
The prior diff that introduced `addAffineIfOpDomain` missed appending
constraints from the ifOp domain. This revision fixes this problem.

Differential Revision: https://reviews.llvm.org/D86421
2020-08-28 00:34:23 +05:30
Christopher Tetreault 5e63083435 [SVE] Remove calls to VectorType::getNumElements from Transforms/Vectorize
Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D82056
2020-08-27 12:02:20 -07:00
Saiyedul Islam a1bdf8f545 [OpenMP] Ensure testing for versions 4.5 and default - Part 2
Many OpenMP Clang tests do not RUN for version 4.5 and the default
version. This second patch in the series handles test cases which
require updation in CHECK lines along with adding RUN lines for
the default version. It involves updating line number of pragmas.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D85150
2020-08-27 18:50:52 +00:00
Kiran Chandramohan 875074c8a9 [OpenMP][MLIR] Conversion pattern for OpenMP to LLVM
Adding a conversion pattern for the parallel Operation. This will
help the conversion of parallel operation with standard dialect to
parallel operation with llvm dialect. The type conversion of the block
arguments in a parallel region are controlled by the pattern for the
parallel Operation. Without this pattern, a parallel Operation with
block arguments cannot be converted from standard to LLVM dialect.
Other OpenMP operations without regions are marked as legal. When
translation of OpenMP operations with regions are added then patterns
for these operations can also be added.
Also uses all the standard to llvm patterns. Patterns of other dialects
can be added later if needed.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D86273
2020-08-27 19:32:15 +01:00
Jez Ng d2b845dd6c [lld-macho] Disable invalid/stub-link.s test for Mac
It seems to be failing on some Google Buildbots.

This diff also includes a minor fix for the install name of one of
libSystem's re-exports. I don't think it's the cause of the test
failure, though. The wrong install name just meant that the symbol
lookup failure would still happen, but it would have been caused by the
re-export not being found, instead of the arch failing to match.

Differential Revision: https://reviews.llvm.org/D86728
2020-08-27 11:24:22 -07:00
Louis Dionne 21a1a263a6 [libc++][NFC] Define functor's call operator inline
This fixes a mismatched visibility attribute on the call operator in
addition to making the code clearer. Given this is a simple lambda
in essence, the intent has always been to give it inline visibility.
2020-08-27 14:20:34 -04:00
Christopher Tetreault 5a55e2781c [SVE] Remove calls to VectorType::getNumElements from IR
Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D81500
2020-08-27 11:16:10 -07:00
Matt Arsenault e53b799779 GlobalISel: Use & operator on KnownBits
Avoid repeating for zero and one
2020-08-27 14:07:18 -04:00
Matt Arsenault 531f7063ba GlobalISel: Implement known bits for G_MERGE_VALUES 2020-08-27 14:07:18 -04:00
Mikhail Maltsev 433f2680c9 [ARM][BFloat16] Change types of some Arm and AArch64 bf16 intrinsics
Add bitcode files which got truncated to 0 length in phabricator.

Differential Revision: https://reviews.llvm.org/D86146
2020-08-27 18:52:59 +01:00
Matt Arsenault 9607ccf626 GlobalISel: Remove leftover lit.local.cfg
The global-isel feature has been required for a long time and was
removed in c9455d3c57, so this was
causing all tests to be skipped.
2020-08-27 13:49:06 -04:00
Mikhail Maltsev ae1396c7d4 [ARM][BFloat16] Change types of some Arm and AArch64 bf16 intrinsics
This patch adjusts the following ARM/AArch64 LLVM IR intrinsics:
- neon_bfmmla
- neon_bfmlalb
- neon_bfmlalt
so that they take and return bf16 and float types. Previously these
intrinsics used <8 x i8> and <4 x i8> vectors (a rudiment from
implementation lacking bf16 IR type).

The neon_vbfdot[q] intrinsics are adjusted similarly. This change
required some additional selection patterns for vbfdot itself and
also for vector shuffles (in a previous patch) because of SelectionDAG
transformations kicking in and mangling the original code.

This patch makes the generated IR cleaner (less useless bitcasts are
produced), but it does not affect the final assembly.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D86146
2020-08-27 18:43:16 +01:00
Craig Topper ba852e1e19 [X86] Don't call hasFnAttribute and getFnAttribute for 'prefer-vector-width' and 'min-legal-vector-width' in getSubtargetImpl
We only need to call getFnAttribute and then check if the Attribute
is None or not.
2020-08-27 10:40:20 -07:00
Owen Anderson e9d9a61208 Reapply D70800: Fix AArch64 AAPCS frame record chain
Original Commit Message:
After the commit r368987 (rG643adb55769e) was landed, the frame record (FP and LR register)
may be placed in the middle of a stack frame if a function has both callee-saved
general-purpose registers and floating point registers. This will break the stack unwinders
that simply walk through the frame records (based on the guarantee from AAPCS64
"The Frame Pointer" section). This commit fixes the problem by adding the frame record offset.

Patch By: logan
Differential Revision: D70800
2020-08-27 17:29:41 +00:00
Teresa Johnson 5b9d462b7d [HeapProf] Fix bot failures from instrumentation pass
Fix bot failure from 7ed8124d46f94601d5f1364becee9cee8538265e:
http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-ubuntu/builds/8533

Since we are always using dynamic shadow,
insertDynamicShadowAtFunctionEntry should always return true for
modifying the function.
2020-08-27 10:21:19 -07:00
LLVM GN Syncbot b3efa65363 [gn build] Port 7ed8124d46 2020-08-27 17:08:02 +00:00
Arthur Eubanks 897839425b [gn build] Manually port c9455d3 2020-08-27 10:05:34 -07:00
Arthur Eubanks 8bdb98c781 [test][Inliner] Make always-inline.ll work with NPM
The NPM doesn't support call-site alwaysinline as described in the comments.

Also make NPM runs more similar to legacy PM runs.

Reviewed By: ychen, asbirlea

Differential Revision: https://reviews.llvm.org/D86663
2020-08-27 10:00:54 -07:00
Aditya Nandakumar db464a3dbf [GISel] Add new GISel combiners for G_SELECT
https://reviews.llvm.org/D83833

Patch adds two new GICombinerRules for G_SELECT. The rules include:
combining selects with undef comparisons into their first selectee value,
and to combine away selects with constant comparisons. Patch additionally
adds a new combiner test for the AArch64 target to test these new G_SELECT
combiner rules and the existing select_same_val combiner rule.

Patch by  mkitzan
2020-08-27 09:40:15 -07:00
Jonas Devlieghere a7e4a17735 [lldb] Make lldb-argdumper a dependency of liblldb
Always make lldb-argdumper a dependency of liblldb. Currently it is only
a dependency of the python swig target because of the relative symlink
in the python resource directory. That means that the dependency won't
be there when LLDB_ENABLE_PYTHON is disabled.

Differential revision: https://reviews.llvm.org/D86722
2020-08-27 09:31:02 -07:00
Jonas Devlieghere b981924bdd [lldb] Move triple construction out of getArchCFlags in DarwinBuilder (NFC)
Move the construction of the triple out of getArchCFlags in the
DarwinBuilder.
2020-08-27 09:31:01 -07:00
Arthur Eubanks dd04fa17d7 [OCaml] Remove add_constant_propagation
After https://reviews.llvm.org/D85159.
2020-08-27 09:30:21 -07:00
Simon Moll c48b06c44f [sda][nfc] clang-formatting 2020-08-27 18:27:44 +02:00
Shinji Okumura 7a68f0f1e0 [Attributor] Add a phase flag to Attributor
Add a new flag that indicates which stage in the process we are in.
This flag is introduced for handling behavior of `getAAFor` according to the stage. (discussed in D86635)

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D86678
2020-08-28 01:16:38 +09:00
Aditya Nandakumar 5c2db1655b [GISel]: Fix one more CSE Non determinism
https://reviews.llvm.org/D86676

Sometimes we can have the following code

 x:gpr(s32) = G_OP

Say we build G_OP2 to the same x and then delete the previous instruction. Using something like

 Register X = ...;
 auto NewMIB = CSEBuilder.buildOp2(X, ... args);

Currently there's a mismatch in how NewMIB is profiled and inserted into the CSEMap (ie it doesn't consider register bank/register class along with type).Unify the profiling by refactoring and calling the common method.

This was found by turning on the CSEInfo::verify in at the end of each of our GISel passes which turns inconsistent state/non determinism in CSEing into crashes which likely usually indicates missing calls to Observer on mutations (the most common case). Here non determinism usually means not cseing sometimes, but almost never about producing incorrect code.
Also this patch adds this verification at the end of the combiners as well.
2020-08-27 09:06:21 -07:00
Lucas Prates 3d943bcd22 [CodeGen] Properly propagating Calling Convention information when lowering vector arguments
When joining the legal parts of vector arguments into its original value
during the lower of Formal Arguments in SelectionDAGBuilder, the Calling
Convention information was not being propagated for the handling of each
individual parts. The same did not happen when lowering calls, causing a
mismatch.

This patch fixes the issue by properly propagating the Calling
Convention details.

This fixes Bugzilla #47001.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D86715
2020-08-27 17:01:10 +01:00
Benjamin Kramer fddf543e6e [MLIR][GPUToSPIRV] Fix use-after-free. Found by asan. 2020-08-27 17:57:11 +02:00
Teresa Johnson 7ed8124d46 [HeapProf] Clang and LLVM support for heap profiling instrumentation
See RFC for background:
http://lists.llvm.org/pipermail/llvm-dev/2020-June/142744.html

Note that the runtime changes will be sent separately (hopefully this
week, need to add some tests).

This patch includes the LLVM pass to instrument memory accesses with
either inline sequences to increment the access count in the shadow
location, or alternatively to call into the runtime. It also changes
calls to memset/memcpy/memmove to the equivalent runtime version.
The pass is modeled on the address sanitizer pass.

The clang changes add the driver option to invoke the new pass, and to
link with the upcoming heap profiling runtime libraries.

Currently there is no attempt to optimize the instrumentation, e.g. to
aggregate updates to the same memory allocation. That will be
implemented as follow on work.

Differential Revision: https://reviews.llvm.org/D85948
2020-08-27 08:50:35 -07:00
Mikhail Maltsev a19fd1aab5 Revert "[libcxx] Fix compile for BUILD_EXTERNAL_THREAD_LIBRARY"
This reverts commit 3b71f91558.

The commit is breaking some build bots.
2020-08-27 16:48:10 +01:00
Roman Lebedev 6102310d81
[InstSimplify][EarlyCSE] Try to CSE PHI nodes in the same basic block
Apparently, we don't do this, neither in EarlyCSE, nor in InstSimplify,
nor in (old) GVN, but do in NewGVN and SimplifyCFG of all places..

While i could teach EarlyCSE how to hash PHI nodes,
we can't really do much (anything?) even if we find two identical
PHI nodes in different basic blocks, same-BB case is the interesting one,
and if we teach InstSimplify about it (which is what i wanted originally,
https://reviews.llvm.org/D86530), we get EarlyCSE support for free.

So i would think this is pretty uncontroversial.

On vanilla llvm test-suite + RawSpeed, this has the following effects:
```
| statistic name                                     | baseline  | proposed  |      Δ |        % |    \|%\| |
|----------------------------------------------------|-----------|-----------|-------:|---------:|---------:|
| instsimplify.NumPHICSE                             | 0         | 23779     |  23779 |    0.00% |    0.00% |
| asm-printer.EmittedInsts                           | 7942328   | 7942392   |     64 |    0.00% |    0.00% |
| assembler.ObjectBytes                              | 273069192 | 273084704 |  15512 |    0.01% |    0.01% |
| correlated-value-propagation.NumPhis               | 18412     | 18539     |    127 |    0.69% |    0.69% |
| early-cse.NumCSE                                   | 2183283   | 2183227   |    -56 |    0.00% |    0.00% |
| early-cse.NumSimplify                              | 550105    | 542090    |  -8015 |   -1.46% |    1.46% |
| instcombine.NumAggregateReconstructionsSimplified  | 73        | 4506      |   4433 | 6072.60% | 6072.60% |
| instcombine.NumCombined                            | 3640264   | 3664769   |  24505 |    0.67% |    0.67% |
| instcombine.NumDeadInst                            | 1778193   | 1783183   |   4990 |    0.28% |    0.28% |
| instcount.NumCallInst                              | 1758401   | 1758799   |    398 |    0.02% |    0.02% |
| instcount.NumInvokeInst                            | 59478     | 59502     |     24 |    0.04% |    0.04% |
| instcount.NumPHIInst                               | 330557    | 330533    |    -24 |   -0.01% |    0.01% |
| instcount.TotalInsts                               | 8831952   | 8832286   |    334 |    0.00% |    0.00% |
| simplifycfg.NumInvokes                             | 4300      | 4410      |    110 |    2.56% |    2.56% |
| simplifycfg.NumSimpl                               | 1019808   | 999607    | -20201 |   -1.98% |    1.98% |
```
I.e. it fires ~24k times, causes +110 (+2.56%) more `invoke` -> `call`
transforms, and counter-intuitively results in *more* instructions total.

That being said, the PHI count doesn't decrease that much,
and looking at some examples, it seems at least some of them
were previously getting PHI CSE'd in SimplifyCFG of all places..

I'm adjusting `Instruction::isIdenticalToWhenDefined()` at the same time.
As a comment in `InstCombinerImpl::visitPHINode()` already stated,
there are no guarantees on the ordering of the operands of a PHI node,
so if we just naively compare them, we may false-negatively say that
the nodes are not equal when the only difference is operand order,
which is especially important since the fold is in InstSimplify,
so we can't rely on InstCombine sorting them beforehand.

Fixing this for the general case is costly (geomean +0.02%),
and does not appear to catch anything in test-suite, but for
the same-BB case, it's trivial, so let's fix at least that.

As per http://llvm-compile-time-tracker.com/compare.php?from=04879086b44348cad600a0a1ccbe1f7776cc3cf9&to=82bdedb888b945df1e9f130dd3ac4dd3c96e2925&stat=instructions
this appears to cause geomean +0.03% compile time increase (regression),
but geomean -0.01%..-0.04% code size decrease (improvement).
2020-08-27 18:47:04 +03:00
Roman Lebedev 94d3dd8b08
[NFC][EarlyCSE][InstSimplify] Add tests for CSE of PHI nodes
PHI nodes depend on the block they're in,
so we can only deal with the most basic case of same-BB PHI's.
2020-08-27 18:47:03 +03:00
Russell Gallop c9455d3c57 [Test] Tidy up loose ends from LLVM_HAS_GLOBAL_ISEL
This hasn't been allowed as a build option since r309990

Remove leftover REQUIRES: global-isel

Differential Revision: https://reviews.llvm.org/D86714
2020-08-27 16:36:27 +01:00
Louis Dionne 49644cd941 [libc++] Install a more recent CMake on libc++ builders 2020-08-27 11:26:27 -04:00
David Nicuesa 3b71f91558 [libcxx] Fix compile for BUILD_EXTERNAL_THREAD_LIBRARY
Fix compilation with -DLIBCXX_BUILD_EXTERNAL_THREAD_LIBRARY when using clang. Now linking target  'cxx_external_threads' with 'cxx-headers'. Fix mismatching visibility for `libcpp_timed_backoff_policy` function in file <__threading_support>.

Reviewed By: #libc, ldionne

Differential Revision: https://reviews.llvm.org/D86598
2020-08-27 16:24:19 +01:00
Cullen Rhodes 42587345a3 [CodeGen][AArch64] Support arm_sve_vector_bits attribute
This patch implements codegen for the 'arm_sve_vector_bits' type
attribute, defined by the Arm C Language Extensions (ACLE) for SVE [1].
The purpose of this attribute is to define vector-length-specific (VLS)
versions of existing vector-length-agnostic (VLA) types.

VLSTs are represented as VectorType in the AST and fixed-length vectors
in the IR everywhere except in function args/return. Implemented in this
patch is codegen support for the following:

  * Implicit casting between VLA <-> VLS types.
  * Coercion of VLS types in function args/return.
  * Mangling of VLS types.

Casting is handled by the CK_BitCast operation, which has been extended
to support the two new vector kinds for fixed-length SVE predicate and
data vectors, where the cast is implemented through memory rather than a
bitcast which is unsupported. Implementing this as a normal bitcast
would require relaxing checks in LLVM to allow bitcasting between
scalable and fixed types. Another option was adding target-specific
intrinsics, although codegen support would need to be added for these
intrinsics. Given this, casting through memory seemed like the best
approach as it's supported today and existing optimisations may remove
unnecessary loads/stores, although there is room for improvement here.

Coercion of VLSTs in function args/return from fixed to scalable is
implemented through the AArch64 ABI in TargetInfo.

The VLA and VLS types are defined by the ACLE to map to the same
machine-level SVE vectors. VLS types are mangled in the same way as:

  __SVE_VLS<typename, unsigned>

where the first argument is the underlying variable-length type and the
second argument is the SVE vector length in bits. For example:

  #if __ARM_FEATURE_SVE_BITS==512
  // Mangled as 9__SVE_VLSIu11__SVInt32_tLj512EE
  typedef svint32_t vec __attribute__((arm_sve_vector_bits(512)));
  // Mangled as 9__SVE_VLSIu10__SVBool_tLj512EE
  typedef svbool_t pred __attribute__((arm_sve_vector_bits(512)));
  #endif

The latest ACLE specification (00bet5) does not contain details of this
mangling scheme, it will be specified in the next revision.  The
mangling scheme is otherwise defined in the appendices to the Procedure
Call Standard for the Arm Architecture, see [2] for more information.

[1] https://developer.arm.com/documentation/100987/latest
[2] https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#appendix-c-mangling

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D85743
2020-08-27 15:11:58 +00:00