Commit Graph

396140 Commits

Author SHA1 Message Date
Matt Morehouse 7ca2b9aac7 [libFuzzer] Add missing include on Darwin. 2021-08-05 12:27:13 -07:00
Fangrui Song c38efb4899 [clang] Implement -falign-loops=N (N is a power of 2) for non-LTO
GCC supports multiple forms of -falign-loops=.
-falign-loops= is currently ignored in Clang.

This patch implements the simplest but the most useful form where N is a
power of 2.

The underlying implementation uses a `llvm::TargetOptions` option for now.
Bitcode generation ignores this option.

Differential Revision: https://reviews.llvm.org/D106701
2021-08-05 12:17:50 -07:00
Bill Wendling 4d293f215d [llvm-diff] Create libLLVMDiff library
Some tools may want to use the LLVM "diff" code. Move the code into a
library for easy use.

No functionality change intende.

Differential Revision: https://reviews.llvm.org/D107392
2021-08-05 12:05:50 -07:00
Jon Roelofs 98f38c151b [AArch64][GlobalISel] Legalize ctpop s128
This is re-landing the same patch again, but without the changes to
LegalizerHelper that regressed the Mips test:

test/CodeGen/Mips/GlobalISel/llvm-ir/ctpop.ll

Differential revision: https://reviews.llvm.org/D106494
2021-08-05 11:54:53 -07:00
Matt Morehouse 881faf4190 Enable extra coverage counters on Windows
- Enable extra coverage counters on Windows.
- Update extra_counters.test to run on Windows also.
- Update TableLookupTest.cpp to include the required pragma/declspec for the extra coverage counters.

Patch By: MichaelSquires

Reviewed By: morehouse

Differential Revision: https://reviews.llvm.org/D106676
2021-08-05 11:40:15 -07:00
Arthur O'Dwyer 892990c56c [libc++] IWYU to fix complaints when compiling with Modules. NFCI.
Differential Revision: https://reviews.llvm.org/D107583
2021-08-05 14:30:08 -04:00
Chris Jackson 113a06f7a5 {DebugInfo][LSR] Don't cache dbg.value that are already undef
The SCEV-based salvaging method caches dbg.value information pre-LSR so
that salvaging may be attempted post-LSR. If the dbg.value are already
undef pre-LSR then a salvage attempt would be fruitless, so avoid
caching them.

Reviewed By: StephenTozer

Differential Revision: https://reviews.llvm.org/D107448
2021-08-05 19:16:43 +01:00
Matt Morehouse ec5137029b Revert "[llvm-diff] Create libLLVMDiff library"
This reverts commit 9854f2f30f since it
broke all the builds.
2021-08-05 11:10:58 -07:00
Dimitry Andric b260f3fdda sanitizer_common: disable thread safety annotations for googletest
Recently in 0da172b176 thread safety warnings-as-errors were enabled.
However, googletest is currently not compatible with thread safety
annotations. On FreeBSD, which has the pthread functions marked with
such annotations, this results in errors when building the compiler-rt
tests:

    In file included from compiler-rt/lib/interception/tests/interception_test_main.cpp:15:
    In file included from llvm/utils/unittest/googletest/include/gtest/gtest.h:62:
    In file included from llvm/utils/unittest/googletest/include/gtest/internal/gtest-internal.h:40:
    llvm/utils/unittest/googletest/include/gtest/internal/gtest-port.h:1636:3: error: mutex 'mutex_' is still held at the end of function [-Werror,-Wthread-safety-analysis]
      }
      ^
    llvm/utils/unittest/googletest/include/gtest/internal/gtest-port.h:1633:32: note: mutex acquired here
        GTEST_CHECK_POSIX_SUCCESS_(pthread_mutex_lock(&mutex_));
                                   ^
    llvm/utils/unittest/googletest/include/gtest/internal/gtest-port.h:1645:32: error: releasing mutex 'mutex_' that was not held [-Werror,-Wthread-safety-analysis]
        GTEST_CHECK_POSIX_SUCCESS_(pthread_mutex_unlock(&mutex_));
                                   ^
    2 errors generated.

At some point googletest will hopefully be made compatible with thread
safety annotations, but for now add corresponding `-Wno-thread-*` flags
to `COMPILER_RT_GTEST_CFLAGS` to silence these warnings-as-errors.

Reviewed By: dvyukov

Differential Revision: https://reviews.llvm.org/D107491
2021-08-05 20:07:24 +02:00
Geoffrey Martin-Noble f8b6e1faa9 [Bazel] Update for 9854f2f30f (Diff library)
Updates the Bazel build for
https://github.com/llvm/llvm-project/commit/9854f2f30f by extracting a
library from llvm-diff. Note that this does not include the new
llvm-livepatch binary, for which the CMake file was added accidentally
and reverted in https://github.com/llvm/llvm-project/commit/fec8f1a008.

Differential Revision: https://reviews.llvm.org/D107586
2021-08-05 11:01:44 -07:00
Hedin Garca a9628e96ca [libc] Add diff and perf targets for more math functions
Comparing the run time of math functions from LLVM libc
with the MSVCRT libc:
|function	|perf-LLVM libc		    |perf-MSVCRT
|ceilf		|2.36 mins (141491389600 ns)|47.10 sec (47100940100 ns)
|exp2f             |6.37 mins (358441794700 ns)|12.39 mins (719404388300 ns)
|expf		|6.35 mins (381204661800 ns)|6.17 mins (346150163200 ns)
|fabsf		|1.18 mins (78425546600 ns) |53.75 sec (53745301900 ns)
|floorf		|3.15 mins (164770963800 ns)|45.94 sec (45935988400 ns)
|logbf		|4.38 mins (262508058800 ns)|55.47 sec (55466377700 ns)
|nearbyintf	|3.20 mins (167972868000 ns)|9.13 mins (523822963600 ns)
|rintf		|3.20 mins (168001498700 ns)|22.35 mins (1341266448800 ns)
|roundf		|2.35 mins (141151500600 ns)|1.42 mins (85326429800 ns)
|truncf		|2.31 mins (114846424000 ns)|59.41 sec (59414309100 ns)

Evaluating the number of differing results in Windows:
|function	|diff
|ceilf          |8388606 differing results
|exp2f         |213303887 differing results
|expf           |193922 differing results
|fabsf          |8388606 differing results
|floorf         |8388606 differing results
|logbf          |0 differing results
|nearbyintf     |0 differing results
|rintf          |0 differing results
|roundf         |0 differing results
|truncf 	|0 differing results

Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D107462
2021-08-05 17:57:04 +00:00
Bill Wendling fec8f1a008 Remove unintended commit. 2021-08-05 10:51:37 -07:00
Jon Chesterfield 509854b69c [clang] Replace asm with __asm__ in cuda header
Asm is a gnu extension for C, so at present -fopenmp -std=c99
and similar fail to compile on nvptx, bug 51344

Changing to `__asm__` or `__asm` works for openmp, all three appear to work
for cuda. Suggesting `__asm__` here as `__asm` is used by MSVC with different
syntax, so this should make for better error diagnostics if the header is
passed to a compiler other than clang.

Reviewed By: tra, emankov

Differential Revision: https://reviews.llvm.org/D107492
2021-08-05 18:46:57 +01:00
Roman Lebedev c0586ff05d
[NFC][X86] combineX86ShuffleChain(): hoist Mask variable higher up
Having `NewMask` outside of an if and rebinding `BaseMask` `ArrayRef`
to it is confusing. Instead, just move the `Mask` vector higher up,
and change the code that earlier had no access to it but now does
to use `Mask` instead of `BaseMask`.

This has no other intentional changes.

This is a recommit of 35c0848b57,
that was reverted to simplify reversion of an earlier change.
2021-08-05 20:37:51 +03:00
Roman Lebedev 16605aea84
[NFC][Codegen][X86] Add testcase that hanged after D107009
From Benjamin Kramer @ https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20210802/945642.html
2021-08-05 20:37:50 +03:00
Bill Wendling 9854f2f30f [llvm-diff] Create libLLVMDiff library
Some tools may want to use the LLVM "diff" code. Move the code into a
library for easy use.

No functionality change intende.

Differential Revision: https://reviews.llvm.org/D107392
2021-08-05 10:36:01 -07:00
Fangrui Song 72d070b4db [ELF] Support copy relocation on non-default version symbols
Copy relocation on a non-default version symbol is unsupported and can crash at
runtime. Fortunately there is a one-line fix which works for most cases:
ensure `getSymbolsAt` unconditionally returns `ss`.

If two non-default version symbols are defined at the same place and both
are copy relocated, our implementation will copy relocated them into different
addresses. The pointer inequality is very unlikely an issue. In GNU ld, copy
relocating version aliases seems to create more pointer inequality problems than
us.

(
In glibc, sys_errlist@GLIBC_2.2.5 sys_errlist@GLIBC_2.3 sys_errlist@GLIBC_2.4
are defined at the same place, but it is unlikely they are all copy relocated in
one executable. Even if so, the variables are read-only and pointer inequality
should not be a problem.
)

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D107535
2021-08-05 10:32:14 -07:00
Jonas Devlieghere a46bcc60e5 [lldb] Refactor IRExecutionUnit::FindInSymbols (NFC)
This patch refactors IRExecutionUnit::FindInSymbols. It eliminates a few
potential pitfalls and tries to be more explicit about the state carried
between symbol resolution attempts.

Differential revision: https://reviews.llvm.org/D107206
2021-08-05 10:18:14 -07:00
Jonas Devlieghere c020be17ce [lldb] Use a struct to pass function search options to Module::FindFunction
Rather than passing two booleans around, which is especially error prone
with them being next to each other, use a struct with named fields
instead.

Differential revision: https://reviews.llvm.org/D107295
2021-08-05 10:18:14 -07:00
Dan Liew a756239e72 Fix COMPILER_RT_DEBUG build for targets that don't support thread local storage.
022439931f added code that is only enabled
when COMPILER_RT_DEBUG is enabled. This code doesn't build on targets
that don't support thread local storage because the code added uses the
THREADLOCAL macro. Consequently the COMPILER_RT_DEBUG build broke for
some Apple targets (e.g. 32-bit iOS simulators).

```
/Volumes/user_data/dev/llvm/llvm.org/main/src/llvm-project/compiler-rt/lib/sanitizer_common/sanitizer_mutex.cpp:216:8: error: thread-local storage is not supported for the current target
static THREADLOCAL InternalDeadlockDetector deadlock_detector;
       ^
/Volumes/user_data/dev/llvm/llvm.org/main/src/llvm-project/compiler-rt/lib/sanitizer_common/sanitizer_internal_defs.h:227:24: note: expanded from macro 'THREADLOCAL'
 # define THREADLOCAL   __thread
                        ^
1 error generated.
```

To fix this, this patch introduces a `SANITIZER_SUPPORTS_THREADLOCAL`
macro that is `1` iff thread local storage is supported by the current
target. That condition is then added to `SANITIZER_CHECK_DEADLOCKS` to
ensure the code is only enabled when thread local storage is available.

The implementation of `SANITIZER_SUPPORTS_THREADLOCAL` currently assumes
Clang. See `llvm-project/clang/include/clang/Basic/Features.def` for the
definition of the `tls` feature.

rdar://81543007

Differential Revision: https://reviews.llvm.org/D107524
2021-08-05 10:07:25 -07:00
Ramesh Peri 976bd23612 [llvm-ar] Fix for handling thin archive with SYM64 and a test case for it
WHen thin archives are created which have symbol table of type SYM64 then all the tools will not work since they cannot read the files properly.
One can reproduce the problem as follows:
1. Take a hello world program and create an archive out of it. The SYM64_THRESHOLD=0 will force the generation of SYM64 symbol table.
    clang -c hello.cpp
    SYM64_THRESHOLD=0 llvm-ar crsT mylib.a hello.o
2. Now try to use any of the tools on this mylib.a and it will fail.
    llvm-nm -M mylib.a

THis fix will eliminate these failures. A regression test is created in llvm/test/Object/archive-symtab.test

Reviewed By: MaskRay, Ramesh

Differential Revision: https://reviews.llvm.org/D107322
2021-08-05 10:06:34 -07:00
Jon Roelofs b4c0307d59 Fix clang-interpreter build after 2487db1f28 2021-08-05 10:05:36 -07:00
Benjamin Kramer bd17ced1db Revert "[X86] combineX86ShuffleChain(): canonicalize mask elts picking from splats"
This reverts commits f819e4c7d0 and
35c0848b57. It triggers an infinite loop during
compilation.

$ cat t.ll
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

define void @MaxPoolGradGrad_1.65() local_unnamed_addr #0 {
entry:
  %wide.vec78 = load <64 x i32>, <64 x i32>* null, align 16
  %strided.vec83 = shufflevector <64 x i32> %wide.vec78, <64 x i32> poison, <8 x i32> <i32 4, i32 12, i32 20, i32 28, i32 36, i32 44, i32 52, i32 60>
  %0 = lshr <8 x i32> %strided.vec83, <i32 16, i32 16, i32 16, i32 16, i32 16, i32 16, i32 16, i32 16>
  %1 = add <8 x i32> zeroinitializer, %0
  %2 = shufflevector <8 x i32> %1, <8 x i32> undef, <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15>
  %3 = shufflevector <16 x i32> %2, <16 x i32> undef, <32 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15, i32 16, i32 17, i32 18, i32 19, i32 20, i32 21, i32 22, i32 23, i32 24, i32 25, i32 26, i32 27, i32 28, i32 29, i32 30, i32 31>
  %interleaved.vec = shufflevector <32 x i32> undef, <32 x i32> %3, <64 x i32> <i32 0, i32 8, i32 16, i32 24, i32 32, i32 40, i32 48, i32 56, i32 1, i32 9, i32 17, i32 25, i32 33, i32 41, i32 49, i32 57, i32 2, i32 10, i32 18, i32 26, i32 34, i32 42, i32 50, i32 58, i32 3, i32 11, i32 19, i32 27, i32 35, i32 43, i32 51, i32 59, i32 4, i32 12, i32 20, i32 28, i32 36, i32 44, i32 52, i32 60, i32 5, i32 13, i32 21, i32 29, i32 37, i32 45, i32 53, i32 61, i32 6, i32 14, i32 22, i32 30, i32 38, i32 46, i32 54, i32 62, i32 7, i32 15, i32 23, i32 31, i32 39, i32 47, i32 55, i32 63>
  store <64 x i32> %interleaved.vec, <64 x i32>* undef, align 16
  unreachable
}

$ llc < t.ll -mcpu=skylake
<hang>
2021-08-05 18:58:08 +02:00
Jessica Paquette f3f3098afe [AArch64][GlobalISel] Mark v16s8 <- v8s8, v8s8 G_CONCAT_VECTOR as legal
G_CONCAT_VECTORS shows up from time to time when legalizing other instructions.

We actually import patterns for the v16s8 <- v8s8, v8s8 case so marking it
as legal gives us selection for free.

Differential Revision: https://reviews.llvm.org/D107512
2021-08-05 09:40:46 -07:00
Daniele Vettorel 180f4a87c5 Add llvm-stress binary to Bazel build configuration.
The `llvm-stress` binary is currently missing from the Bazel `BUILD` file for llvm. This patch adds it.

Reviewed By: GMNGeoffrey

Differential Revision: https://reviews.llvm.org/D107571
2021-08-05 12:35:46 -04:00
Florian Hahn 97469d4c20
[SLP] Add additional memory version tests. 2021-08-05 17:21:10 +01:00
Jennifer Yu 6b0f35931a Fix signal during the call to checkOpenMPLoop.
The root problem is a null pointer is accessed during the call to
checkOpenMPLoop, because loop up bound expr is an error expression
due to error diagnostic was emit early.

To fix this, in setLCDeclAndLB, setUB and setStep instead return false,
return true when LB, UB or Step contains Error, so that the checking is
stopped in checkOpenMPLoop.

Differential Revision: https://reviews.llvm.org/D107385
2021-08-05 08:59:35 -07:00
Kazu Hirata 72661f337a [Transforms] Drop unnecessary const from return types (NFC)
Identified with readability-const-return-type.
2021-08-05 08:53:17 -07:00
Alexey Bataev e7c3eaa8ae [SLP]Do not emit extra shuffle for insertelements vectorization.
If the vectorized insertelements instructions form indentity subvector
(the subvector at the beginning of the long vector), it is just enough
to extend the vector itself, no need to generate inserting subvector
shuffle.

Differential Revision: https://reviews.llvm.org/D107494
2021-08-05 08:41:24 -07:00
Craig Topper f7076cfd3a [DAGCombiner][RISCV][AMDGPU] Call SimplifyDemandedBits at the end of visitMULHU to enable known bits contant folding.
We don't have real demanded bits support for MULHU, but we can
still use the known bits based constant folding support at the end
of SimplifyDemandedBits to simplify a MULHU. This helps with cases
where we know the LHS and RHS have enough leading zeros so that
the high multiply result is always 0.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D106471
2021-08-05 08:31:26 -07:00
David Sherwood e9177b0958 Fix build issues caused by 95800da914 2021-08-05 16:26:34 +01:00
Sander de Smalen 3e47f009ff [LV] Consider ExtractValue as uniform.
Since all operands to ExtractValue must be loop-invariant when we deem
the loop vectorizable, we can consider ExtractValue to be uniform.

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D107286
2021-08-05 16:20:50 +01:00
Sean Fertile f888e442bc [PowerPC][AIX] attribute aligned cannot decrease align of a vector var.
On AIX an aligned attribute cannot decrease the alignment of a variable
when placed on a variable declaration of vector type.

Differential Revision: https://reviews.llvm.org/D107522
2021-08-05 11:15:12 -04:00
eopXD fd7f6a3c81 [NFC][LoopIdiom] rename boolean variable NegStride to IsNegStride
Rename variable for better code readability.

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D107570
2021-08-05 23:11:42 +08:00
Jay Foad 2b63933115 [AMDGPU][SDag] Better lowering for 32-bit ctlz/cttz
Differential Revision: https://reviews.llvm.org/D107566
2021-08-05 15:57:40 +01:00
Jay Foad e6c364a624 [AMDGPU][SDag] Better lowering for 64-bit ctlz/cttz
Differential Revision: https://reviews.llvm.org/D107546
2021-08-05 15:57:40 +01:00
Dmitry Vyukov 35816163f2 tsan: pass thr/pc to MemoryResetRange
Pass thr/pc args to MemoryResetRange as we do everywhere.
Currently they are unused by MemoryResetRange,
but there is no reason to be inconsistent.

Depends on D107562.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D107563
2021-08-05 16:57:02 +02:00
Dmitry Vyukov c6a485caf6 tsan: qualify autos
clang-tidy warning requires qualifying auto pointers:

clang-tidy: warning: 'auto ctx' can be declared as 'auto *ctx' [llvm-qualified-auto]

Fix remaing cases we have in tsan.

Depends on D107561.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D107562
2021-08-05 16:56:47 +02:00
Dmitry Vyukov cb7b0a5f34 tsan: don't include tsan_interceptors.h for Go
None of the interceptors machinery is used/enabled for Go,
so don't include the header, it's not needed (must not be).
The problem is that we have fields in ThreadState that are
not present in the Go build, so changes in thread_interceptors.h
can cause Go build breakages due to missing fields.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D107561
2021-08-05 16:56:28 +02:00
Momchil Velikov f171149e0d [SimpifyCFG] Speculate a store preceded by a local non-escaping load
In SimplifyCFG we may simplify the CFG by speculatively executing
certain stores, when they are preceded by a store to the same
location.  This patch allows such speculation also when the stores are
similarly preceded by a load.

In order for this transformation to be correct we need to ensure that
the memory location is writable and the store in the new location does
not introduce a data race.

Local objects (created by an `alloca` instruction) are always
writable, so once we are past a read from a location it is valid to
also write to that same location.

Seeing just a load does not guarantee absence of a data race (unlike
if we see a store) - the load may still be part of a race, just not
causing undefined behaviour
(cf. https://llvm.org/docs/Atomics.html#optimization-outside-atomic).

In the original program, a data race might have been prevented by the
condition, but once we move the store outside the condition, we must
be sure a data race wasn't possible anyway, no matter what the
condition evaluates to.

One way to be sure that a local object is never concurrently
read/written is check that its address never escapes the function.

Hence this transformation is restricted to local, non-escaping
objects.

Reviewed By: nikic, lebedev.ri

Differential Revision: https://reviews.llvm.org/D107281
2021-08-05 15:54:42 +01:00
Dmitry Vyukov fc545c52cd tsan: handle bugs in symbolizer more gracefully
For symbolizer we only process SIGSEGV signals synchronously
(which means bug in symbolizer or in tsan).
But we still want to reset in_symbolizer to fail gracefully.
Symbolizer and user code use different memory allocators,
so if we don't reset in_symbolizer we can get memory allocated
with one being feed with another, which can cause more crashes.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D107564
2021-08-05 16:53:15 +02:00
Dmitry Vyukov 15eb431537 tsan: modernize MaybeReportThreadLeak
Use C++ casts and auto.
Rename to CollectThreadLeaks b/c it's only collecting, not reporting.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D107568
2021-08-05 16:52:41 +02:00
Michał Górny d99e9461b0 [clang] [clang-repl] Fix linking against LLVMLineEditor
LLVMLineEditor library is part of the LLVM dylib.  Move it into
LLVM_LINK_COMPONENTS to avoid duplicate linking when dylib is being
used.  This fixes building standalone clang against installed LLVM
without static libraries.

Differential Revision: https://reviews.llvm.org/D107558
2021-08-05 16:51:47 +02:00
Simon Pilgrim 2cbf9fd402 [DAG] DAGCombiner::visitVECTOR_SHUFFLE - recognise INSERT_SUBVECTOR patterns
IR typically creates INSERT_SUBVECTOR patterns as a widening of the subvector with undefs to pad to the destination size, followed by a shuffle for the actual insertion - SelectionDAGBuilder has to do something similar for shuffles when source/destination vectors are different sizes.

This combine attempts to recognize these patterns by looking for a shuffle of a subvector (from a CONCAT_VECTORS) that starts at a modulo of its size into an otherwise identity shuffle of the base vector.

This uncovered a couple of target-specific issues as we haven't often created INSERT_SUBVECTOR nodes in generic code - aarch64 could only handle insertions into the bottom of undefs (i.e. a vector widening), and x86-avx512 vXi1 insertion wasn't keeping track of undef elements in the base vector.

Fixes PR50053

Differential Revision: https://reviews.llvm.org/D107068
2021-08-05 15:40:48 +01:00
Florian Hahn 38b098be66
[VectorCombine] Limit scalarization known non-poison indices.
We can only trust the range of the index if it is guaranteed
non-poison.

Fixes PR50949.

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D107364
2021-08-05 15:36:31 +01:00
Dawid Jurczak 06206a8cd1 [BuildLibCalls][NFC] Remove redundant attribute list from emitCalloc
Additionally with this patch aligned DSE which is the only user of emitCalloc.

Differential Revision: https://reviews.llvm.org/D103523
2021-08-05 16:18:38 +02:00
David Sherwood 95800da914 [LoopVectorize] Add support for replication of more intrinsics with scalable vectors
This patch adds more instructions to the Uniforms list, for example certain
intrinsics that are uniform by definition or whose operands are loop invariant.
This list includes:

  1. The intrinsics 'experimental.noalias.scope.decl' and 'sideeffect', which
  are always uniform by definition.
  2. If intrinsics 'lifetime.start', 'lifetime.end' and 'assume' have
  loop invariant input operands then these are also uniform too.

Also, in VPRecipeBuilder::handleReplication we check if an instruction is
uniform based purely on whether or not the instruction lives in the Uniforms
list. However, there are certain cases where calls to some intrinsics can
be effectively treated as uniform too. Therefore, we now also treat the
following cases as uniform for scalable vectors:

  1. If the 'assume' intrinsic's operand is not loop invariant, then we
  are free to treat this as uniform anyway since it's only a performance
  hint. We will get the benefit for the first lane.
  2. When the input pointers for 'lifetime.start' and 'lifetime.end' are loop
  variant then for scalable vectors we assume these still ultimately come
  from the broadcast of an alloca. We do not support scalable vectorisation
  of loops containing alloca instructions, hence the alloca itself would
  be invariant. If the pointer does not come from an alloca then the
  intrinsic itself has no effect.

I have updated the assume test for fixed width, since we now treat it
as uniform:

  Transforms/LoopVectorize/assume.ll

I've also added new scalable vectorisation tests for other intriniscs:

  Transforms/LoopVectorize/scalable-assume.ll
  Transforms/LoopVectorize/scalable-lifetime.ll
  Transforms/LoopVectorize/scalable-noalias-scope-decl.ll

Differential Revision: https://reviews.llvm.org/D107284
2021-08-05 15:17:27 +01:00
Fanbo Meng 91e3995195 Revert "[SystemZ][z/OS] Update target specific __attribute__((aligned)) value for test"
This reverts commit d91234b21c.

Reviewed By: abhina.sreeskantharajan

Differential Revision: https://reviews.llvm.org/D107565
2021-08-05 10:14:02 -04:00
Dawid Jurczak f8cdde7195 [SimplifyLibCalls][NFC] Clean up LibCallSimplifier from 'memset + malloc into calloc' transformation
FoldMallocMemset can be safely removed because since https://reviews.llvm.org/D103009
such transformation is already performed in DSE.

Differential Revision: https://reviews.llvm.org/D103451
2021-08-05 16:08:32 +02:00
Krzysztof Parzyszek d0c3b61498 Delay initialization of OptBisect
When LLVM is used in other projects, it may happen that global cons-
tructors will execute before the call to ParseCommandLineOptions.
Since OptBisect is initialized via a constructor, and has no ability
to be updated at a later time, passing "-opt-bisect-limit" to the
parse function may have no effect.

To avoid this problem use a cl::cb (callback) to set the bisection
limit when the option is actually processed.

Differential Revision: https://reviews.llvm.org/D104551
2021-08-05 09:04:17 -05:00