Commit Graph

396915 Commits

Author SHA1 Message Date
Bjorn Pettersson 36d5138619 [NewPM] Make some sanitizer passes parameterized in the PassRegistry
Refactored implementation of AddressSanitizerPass and
HWAddressSanitizerPass to use pass options similar to passes like
MemorySanitizerPass. This makes sure that there is a single mapping
from class name to pass name (needed by D108298), and options like
-debug-only and -print-after makes a bit more sense when (despite
that it is the unparameterized pass name that should be used in those
options).

A result of the above is that some pass names are removed in favor
of the parameterized versions:
- "khwasan" is now "hwasan<kernel;recover>"
- "kasan" is now "asan<kernel>"
- "kmsan" is now "msan<kernel>"

Differential Revision: https://reviews.llvm.org/D105007
2021-08-19 12:43:37 +02:00
Yaron Keren 23b16d2453 [docs] Document that psutil should be installed in non-user location
Differential Revision: https://reviews.llvm.org/D108356
2021-08-19 13:42:31 +03:00
Renato Golin 894ad26bd5 Update {Small}BitVector size_type definition
SmallBitVector implements a level of indirection over BitVector by
storing a smaller bit-vector in a pointer-sized element, or in case the
number of elements exceeds the bucket size, it creates a new pointer to
a BitVector and uses that as its storage.

However, the functions returning the vector size were using `unsigned`,
which is ok for BitVector, but not for SmallBitVector, which is actually
`uintptr_t`.

This commit reuses the `size_type` definition to more than just `count`
and propagates them into range iteration, size calculation, etc.

This is a continuation of D108124.

I haven't changed all occurrences of `unsigned` or `uintptr_t` to
`size_type`, just those that were directly related.

Following directions from clang-tidy on case of variables.

Differential Revision: https://reviews.llvm.org/D108290
2021-08-19 11:13:38 +01:00
Andrzej Warzynski dcc6b7b1d5 [OptTable] Refine how `printHelp` treats empty help texts
Currently, `printHelp` behaves differently for options that:
  * do not define `HelpText` (such options _are not printed_), and
  * define its `HelpText` as `HelpText<"">` (such options _are printed_).
In practice, both approaches lead to no help text and `printHelp` should
treat them consistently. This patch addresses that by making
`printHelpt` check the length of the help text to be printed.

All affected tests have been updated accordingly. The option definitions
for llvm-cvtres have been updated with a short description or "Not
  implemented" for options that are ignored by the tool.

Differential Revision: https://reviews.llvm.org/D107557
2021-08-19 09:30:15 +00:00
Martin Storsjö cc3affd8b0 [clang] [MSVC] Implement __mulh and __umulh builtins for aarch64
The code is based on the same __mulh and __umulh intrinsics for
x86.

This should fix PR51128.

Differential Revision: https://reviews.llvm.org/D106721
2021-08-19 11:29:55 +03:00
David Sherwood f4122398e7 [LoopVectorize][AArch64] Enable ordered reductions by default for AArch64
I have added a new TTI interface called enableOrderedReductions() that
controls whether or not ordered reductions should be enabled for a
given target. By default this returns false, whereas for AArch64 it
returns true and we rely upon the cost model to make sensible
vectorisation choices. It is still possible to override the new TTI
interface by setting the command line flag:

  -force-ordered-reductions=true|false

I have added a new RUN line to show that we use ordered reductions by
default for SVE and Neon:

  Transforms/LoopVectorize/AArch64/strict-fadd.ll
  Transforms/LoopVectorize/AArch64/scalable-strict-fadd.ll

Differential Revision: https://reviews.llvm.org/D106653
2021-08-19 09:29:40 +01:00
Stuart Ellis 520e5db26a [flang][driver] Add print function name Plugin example
Replacing Hello World example Plugin with one that counts and prints the names of
functions and subroutines.
This involves changing the `PluginParseTreeAction` Plugin base class to
inherit from `PrescanAndSemaAction` class to get access to the Parse Tree
so that the Plugin can walk it.
Additionally, there are tests of this new Plugin to check it prints the correct
things in different circumstances.

Depends on: D106137

Reviewed By: awarzynski

Differential Revision: https://reviews.llvm.org/D107089
2021-08-19 08:25:34 +00:00
Matthias Springer 8e8b70aa84 [mlir][scf] Simplify affine.min ops after loop peeling
Simplify affine.min ops, enabling various other canonicalizations inside the peeled loop body.

affine.min ops such as:
```
map = affine_map<(d0)[s0, s1] -> (s0, -d0 + s1)>
%r = affine.min #affine.min #map(%iv)[%step, %ub]
```
are rewritten them into (in the case the peeled loop):
```
%r = %step
```

To determine how an affine.min op should be rewritten and to prove its correctness, FlatAffineConstraints is utilized.

Differential Revision: https://reviews.llvm.org/D107222
2021-08-19 17:24:53 +09:00
Diana Picus 3330b2532f [flang] Add POSIX implementation for SYSTEM_CLOCK
This is very similar to CPU_TIME, except that we return nanoseconds
rather than seconds. This means we're potentially dealing with rather
large numbers, so we'll have to wrap around to avoid overflows.

Differential Revision: https://reviews.llvm.org/D105970
2021-08-19 07:39:37 +00:00
Christian Sigg 81d5412439 Simplify setting up LLVM as bazel external repo
Only require one intermediate repository instead of two.
Fewer parameters in llvm_config.

Second attempt of https://reviews.llvm.org/D107714, this time also updating `third_party_build` and `deps_impl` paths.

Reviewed By: GMNGeoffrey

Differential Revision: https://reviews.llvm.org/D108274
2021-08-19 09:37:26 +02:00
John Demme 96fbd5cd5e [MLIR] [Python] Add `owner` to `mlir.ir.Block`
Provides a way for python users to access the owning Operation from a Block.
2021-08-19 00:02:09 -07:00
Tobias Gysi 234c4d2362 [mlir][linalg] Set result types in all builders.
Add code to set the result types in all yaml op builders.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D108273
2021-08-19 06:19:12 +00:00
Wenlei He eca03d2768 [CSSPGO] Track and use context-sensitive post-optimization function size to drive global pre-inliner in llvm-profgen
This change enables llvm-profgen to use accurate context-sensitive post-optimization function byte size as a cost proxy to drive global preinline decisions.

To do this, BinarySizeContextTracker is introduced to track function byte size under different inline context during disassembling. In preinliner, we can not query context byte size under switch `context-cost-for-preinliner`. The tracker uses a reverse trie to keep size of functions under different context (callee as parent, caller as child), and it can give best/longest possible matching context size for given input context.

The new size cost is off by default. There're a few TODOs that needs to addressed: 1) avoid dangling string from `Offset2LocStackMap`, which will be addressed in split context work; 2) using inlinee's entry probe to make sure we have correct zero size for inlinee that's completely optimized away after inlining. Some tuning is also needed.

Differential Revision: https://reviews.llvm.org/D108180
2021-08-18 22:50:57 -07:00
Anshil Gandhi f5d5f17d3a Revert "[HIP] Allow target addr space in target builtins"
This reverts commit a35008955f.
2021-08-18 21:38:42 -06:00
Lang Hames da83b70a6f [examples] Fix Kaleidoscope for Windows
This fixes "Resolving symbol with incorrect flags" errors when running the
Kaleidoscope tutorials on Windows.

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D108348
2021-08-19 13:20:51 +10:00
Sam Clegg e4888be74e [WebAssembly] Avoid unused function imports in PIC mode
In PIC mode we import function address via `GOT.mem` imports but for
direct function calls we still import the first class function.
However, if the function is never directly called we can avoid the first
class import completely.

Differential Revision: https://reviews.llvm.org/D108345
2021-08-18 22:31:04 -04:00
luxufan a9095f005f [JITLink] Optimize GOTPCRELX Relocations
This patch optimize the GOTPCRELX Reloations, which is described in X86-64 psabi chapter B.2. And Not all optimization of this chapter is implemented.

1. Convert call and jmp has been implemented
2. Convert mov, but the optimization that when the symbol is defined in the lower 32-bit address space, memory operand in `mov` can be convertted into immediate operand has not been implemented.
3. Conver Test and Binop has not been implemented.

The new test file named ELF_got_plt_optimizations.s has been added, and I moved some test cases about optimization of got/plt from ELF_x86_64_small_pic_relocations.s to the new test file.

By referencing the lld, so, the optimization `Convert call and jmp` is not same as what psabi says, and I have explained it in the comment.

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D108280
2021-08-19 10:30:22 +08:00
Matthias Springer 08dbed8a57 [mlir][linalg] Canonicalize dim ops of tiled_loop block args
E.g.:
```
%y = ... : tensor<...>
linalg.tiled_loop ... ins(%x = %y : tensor<...>) {
  tensor.dim %x, %c0 : tensor<...>
}
```

is rewritten to:
```
%y = ... : tensor<...>
linalg.tiled_loop ... ins(%x = %y : tensor<...>) {
  tensor.dim %y, %c0 : tensor<...>
}
```

Differential Revision: https://reviews.llvm.org/D108272
2021-08-19 11:24:33 +09:00
Lang Hames 8a36750236 [ORC] Handle void and no-argument async wrapper calls. 2021-08-19 12:20:31 +10:00
Sam Clegg 12b1dc0467 [WebAssembly][lld] Convert signature-mismatch.ll test to asm. NFC
Differential Revision: https://reviews.llvm.org/D108346
2021-08-18 22:17:02 -04:00
Vitaly Buka 03bd05f0e8 [sanitizer] Use TMPDIR in Android test
TMPDIR was added long time ago, so no need to use EXTERNAL_STORAGE.
2021-08-18 19:05:21 -07:00
Matthias Springer 9329438244 [mlir][linalg] Remove ConstraintsSet class
The same functionality can be implemented with FlatAffineValueConstraints.

Differential Revision: https://reviews.llvm.org/D108179
2021-08-19 10:57:35 +09:00
LLVM GN Syncbot fe658c3f6e [gn build] Port 5fdaaf7fd8 2021-08-19 01:52:47 +00:00
Peter Collingbourne 6f85225ef3 StackLifetime: Remove asserts for multiple lifetime intrinsics.
According to the langref, it is valid to have multiple consecutive
lifetime start or end intrinsics on the same object.

For llvm.lifetime.start:
"If ptr [...] is a stack object that is already alive, it simply
fills all bytes of the object with poison."

For llvm.lifetime.end:
"Calling llvm.lifetime.end on an already dead alloca is no-op."

However, we currently fail an assertion in such cases. I've observed
the assertion failure when the loop vectorization pass duplicates
the intrinsic.

We can conservatively handle these intrinsics by ignoring all but
the first one, which can be implemented by removing the assertions.

Differential Revision: https://reviews.llvm.org/D108337
2021-08-18 18:45:28 -07:00
Rong Xu 5fdaaf7fd8 [SampleFDO] Flow Sensitive Sample FDO (FSAFDO) profile loader
This patch implements Flow Sensitive Sample FDO (FSAFDO) profile
loader. We have two profile loaders for FS profile,
one before RegAlloc and one before BlockPlacement.

To enable it, when -fprofile-sample-use=<profile> is specified,
add "-enable-fs-discriminator=true \
     -disable-ra-fsprofile-loader=false \
     -disable-layout-fsprofile-loader=false"
to turn on the FS profile loaders.

Differential Revision: https://reviews.llvm.org/D107878
2021-08-18 18:37:35 -07:00
Matthias Springer c777e51468 [mlir][Analysis][NFC] FlatAffineConstraints: Use BoundType enum in functions
Differential Revision: https://reviews.llvm.org/D108185
2021-08-19 10:33:42 +09:00
Vitaly Buka 3d4d1b9b29 [scudo] Don't build SCUDO for Android
Android 11 uses scudo_standalone as default
allocator making difficult to test legacy scudo.
2021-08-18 18:32:54 -07:00
Jon Chesterfield dbd7bad9ad [openmp] Annotate tmp variables with omp_thread_mem_alloc
Fixes miscompile of calls into ocml. Bug 51445.

The stack variable `double __tmp` is moved to dynamically allocated shared
memory by CGOpenMPRuntimeGPU. This is usually fine, but when the variable
is passed to a function that is explicitly annotated address_space(5) then
allocating the variable off-stack leads to a miscompile in the back end,
which cannot decide to move the variable back to the stack from shared.

This could be fixed by removing the AS(5) annotation from the math library
or by explicitly marking the variables as thread_mem_alloc. The cast to
AS(5) is still a no-op once IR is reached.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D107971
2021-08-19 02:22:11 +01:00
Kyungwoo Lee 829616c241 [NFC][DebugInfo] getDwarfCompileUnitID
This is a refactoring for the use in https://reviews.llvm.org/D108261

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D108271
2021-08-18 17:35:03 -07:00
Jon Chesterfield f420939b82 [libomptarget] Apply D106710 to amdgcn devicertl 2021-08-19 01:34:33 +01:00
Aart Bik d37d72eaf8 [mlir][sparse] use shared util for DimOp generation
This shares more code with existing utilities. Also, to be consistent,
we moved dimension permutation on the DimOp to the tensor lowering phase.
This way, both pre-existing DimOps on sparse tensors (not likely but
possible) as well as compiler generated DimOps are handled consistently.

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D108309
2021-08-18 17:12:32 -07:00
Jon Chesterfield c480792b6a [libomptarget][nfc][devicertl] Delete unused enums 2021-08-19 00:14:34 +01:00
Daniel McIntosh f6ba6c3976 [NFC][libcxxabi] Run clang-format on libcxxabi/src/cxa_guard_impl.h
I'm about to submit a change which involves re-writing most of
cxa_guard_impl.h. Running clang-format on the whole file first seems like a
good idea.

Reviewed By: ldionne, #libc_abi

Differential Revision: https://reviews.llvm.org/D108231
2021-08-18 19:09:16 -04:00
Diego Caballero b7cac864b2 [mlir] Fix typo in SuperVectorizer
NFC.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D108334
2021-08-18 22:55:12 +00:00
Omar Emara 82507f1798 [LLDB][GUI] Add Process Launch form
This patch adds a process launch form. Additionally, a LazyBoolean field
was implemented and numerous utility methods were added to various
fields to get the launch form working.

Differential Revision: https://reviews.llvm.org/D107869
2021-08-18 15:43:30 -07:00
owenca 643f2be7b6 [clang-format] Improve detection of parameter declarations in K&R C
Clean up the detection of parameter declarations in K&R C function
definitions. Also make it more precise by requiring the second
token after the r_paren to be either a star or keyword/identifier.

Differential Revision: https://reviews.llvm.org/D108094
2021-08-18 15:21:48 -07:00
Omar Emara 698e210636 [LLDB][GUI] Fix text field incorrect key handling
The isprint libc function was used to determine if the key code
represents a printable character. The problem is that the specification
leaves the behavior undefined if the key is not representable as an
unsigned char, which is the case for many ncurses keys. This patch adds
and explicit check for this undefined behavior and make it consistent.

The llvm::isPrint function didn't work correctly for some reason, most
likely because it takes a char instead of an int, which I guess makes it
unsuitable for checking ncurses key codes.

Reviewed By: clayborg

Differential Revision: https://reviews.llvm.org/D108327
2021-08-18 15:06:27 -07:00
LLVM GN Syncbot a0ed44943a [gn build] Port d8bbfe8a48 2021-08-18 21:58:30 +00:00
Rafael Auler d8bbfe8a48 [DWARF] Expose raw bytes in DWARFExpression
This information is necessary for clients of DebugInfo that
do not want to process a DWARF expression, but just treat it as a blob
of data. In BOLT, for example, we need to read these expressions in
CFIs and write them back to the binary, unchanged, so having access to
the original expression encoding is a shortcut to avoid the need to
re-encode the entire expression when re-writing exception handling
info (CFIs).

This patch is an alternative to https://reviews.llvm.org/D98301, in
which we implement the support to re-encode these expressions. But
since we don't really need to change anything in these expressions,
we can just copy their bytes.

Reviewed By: aprantl

Differential Revision: https://reviews.llvm.org/D107515
2021-08-18 14:41:20 -07:00
Chia-hung Duan 41e5dbe0fa Enables inferring return types for Shape op if possible
Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D102565
2021-08-18 21:36:55 +00:00
Jessica Paquette c22b64ef66 [AArch64][GlobalISel] Don't allow s128 for G_ISNAN
getAPFloatFromSize doesn't support s128, so we can't lower this without
asserting right now.

To fix the buildbots, don't allow any scalars other than s16, s32, and s64.
2021-08-18 13:59:00 -07:00
Peter Collingbourne b2e77cd095 gn build: Build libclang.so and libLTO.so on ELF platforms.
This requires changing the ELF build to enable -fPIC, consistent
with other platforms.

Differential Revision: https://reviews.llvm.org/D108223
2021-08-18 13:48:33 -07:00
Jessica Paquette 3d91d5b757 [AArch64][GlobalISel] Mark G_FMINNUM/G_FMAXNUM as floating point opcodes
We need to ensure that these end up on FPR to allow imported patterns to
select them.

This will also ensure that we get good regbank selection when dealing with
instructions like G_PHI/G_LOAD/G_STORE which deduce their banks from their
uses/users.

Differential Revision: https://reviews.llvm.org/D108260
2021-08-18 13:32:19 -07:00
Jessica Paquette 45e1a6bd25 [AArch64][GlobalISel] Legalize scalar G_FMINNUM + G_FMAXNUM
For subtargets with full FP16, this is legal for s16, s32, and s64. Without
full FP16, it's legal for s32 and s64.

For s128, this is a libcall.

We also support some vector types, but for now, let's just support scalars.

Differential Revision: https://reviews.llvm.org/D108259
2021-08-18 13:30:03 -07:00
Jon Chesterfield 21d91a8ef3 [libomptarget][devicertl] Replace lanemask with uint64 at interface
Use uint64_t for lanemask on all GPU architectures at the interface
with clang. Updates tests. The deviceRTL is always linked as IR so the zext
and trunc introduced for wave32 architectures will fold after inlining.

Simplification partly motivated by amdgpu gfx10 which will be wave32 and
is awkward to express in the current arch-dependant typedef interface.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D108317
2021-08-18 20:47:33 +01:00
Anton Afanasyev cfb6dfcbd1 [AggressiveInstCombine] Add logical shift right instr to `TruncInstCombine` DAG
Add `lshr` instruction to the DAG post-dominated by `trunc`, allowing
TruncInstCombine to reduce bitwidth of expressions containing
these instructions.

We should be shifting by less than the target bitwidth.
Also it is sufficient to require that all truncated bits
of the value-to-be-shifted are zeros: https://alive2.llvm.org/ce/z/_LytbB

Alive2 variable-length proof:
https://godbolt.org/z/1srE1aqzf => s/32/8/ => https://alive2.llvm.org/ce/z/StwPia

Part of https://reviews.llvm.org/D107766

Differential Revision: https://reviews.llvm.org/D108201
2021-08-18 22:20:58 +03:00
Anton Afanasyev 2498c3edcd [Test][AggressiveInstCombine] Add one more tests for shifts 2021-08-18 22:20:57 +03:00
Robert Suderman 76c9712196 [mlir][tosa] Fix clamp to restrict only within valid bitwidth range
Its possible for the clamp to have invalid min/max values on its range. To fix
this we validate the range of the min/max and clamp to a valid range.

Reviewed By: NatashaKnk

Differential Revision: https://reviews.llvm.org/D108256
2021-08-18 12:14:01 -07:00
Michael Kruse 58e4e71fc8 [Polly] Introduce caching for the isErrorBlock function. NFC.
Compilation of the file insn-attrtab.c of the SPEC CPU 2017 502.gcc_r
benchmark takes excessive time (> 30min) with Polly enabled. Most time
is spent in the isErrorBlock function querying the DominatorTree.
The isErrorBlock is invoked redundantly over the course of ScopDetection
and ScopBuilder. This patch introduces a caching mechanism for its
result.

Instead of a free function, isErrorBlock is moved to ScopDetection where
its cache map resides. This also means that many functions directly or
indirectly calling isErrorBlock are not "const" anymore. The
DetectionContextMap was marked as "mutable", but IMHO it never should
have been since it stores the detection result.

502.gcc_r only takes excessive time with the new pass manager. The
reason seeams to be that it invalidates the ScopDetection analysis more
often than the legacy pass manager, for unknown reasons.
2021-08-18 14:05:50 -05:00
Ali Sedaghati cc7bcef3e3 Reapply: [NFC] factor out unrolling decision logic
reverting ffd8a268bd (reapplying
4d559837e8) - removed spurious inclusion
of <optional>

Differential Revision: https://reviews.llvm.org/D106001
2021-08-18 12:04:33 -07:00