Commit Graph

381001 Commits

Author SHA1 Message Date
Christian Sigg f03826f896 Pass GPU events instead of streams across async regions.
Lower !gpu.async.tokens returned from async.execute regions to events instead of streams.

Make !gpu.async.token returned from !async.execute single-use.
This allows creating one event per use and destroying them without leaking or ref-counting.
Technically we only need this for stream/event-based lowering. I kept the code separate
from the rest of the gpu-async-region pass so that we can make this optional or move
to a separate pass as needed.

Reviewed By: herhut

Differential Revision: https://reviews.llvm.org/D96965
2021-02-25 13:18:18 +01:00
Fraser Cormack 84413e1947 [RISCV] Support fixed-length vector truncates
This patch extends support for our custom-lowering of scalable-vector
truncates to include those of fixed-length vectors. It does this by
co-opting the custom RISCVISD::TRUNCATE_VECTOR node and adding mask and
VL operands. This avoids unnecessary duplication of patterns and
inflation of the ISel table.

Some truncates go through CONCAT_VECTORS which currently isn't
efficiently handled, as it goes through the stack. This can be improved
upon in the future.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D97202
2021-02-25 12:11:34 +00:00
Fraser Cormack 3bc5ed3875 [RISCV] Support fixed-length vector sign/zero extension
This patch adds support for the custom lowering sign- and zero-extension
of fixed-length vector types. It does so through custom nodes. Since the
source and destination types are (necessarily) of different sizes, it is
possible that the source type is legal whilst the larger destination
type isn't. In this case the legalization makes heavy use of
EXTRACT_SUBVECTOR.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D97194
2021-02-25 12:05:17 +00:00
Fraser Cormack 821f8bb29a [RISCV] Unify scalable- and fixed-vector EXTRACT_SUBVECTOR lowering
This patch unifies the two disparate paths for lowering
EXTRACT_SUBVECTOR operations under one roof. Consequently, with this
patch it is possible to support any fixed-length subvector extraction,
not just "cast-like" ones.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D97192
2021-02-25 11:46:57 +00:00
Evgeniy Brevnov d0a6f8bb65 [NFC] Fix build failure after 83d134c3c4 2021-02-25 18:43:00 +07:00
Simon Pilgrim 0d835ba48d [X86] Regenerate sdiv_fix.ll tests. NFCI. 2021-02-25 11:37:46 +00:00
Evgeniy Brevnov 83d134c3c4 [NARY-REASSOCIATE] Support reassociation of min/max
Support reassociation for min/max. With that we should be able to transform min(min(a, b), c) -> min(min(a, c), b) if min(a, c) is already available.

Reviewed By: mkazantsev

Differential Revision: https://reviews.llvm.org/D88287
2021-02-25 18:22:39 +07:00
Simon Pilgrim 8b82669d56 [X86][SSE] Move unaryshuffle(xor(x,-1)) -> xor(unaryshuffle(x),-1) fold into helper. NFCI.
We should be able to extend this "canonicalizeShuffleWithBinOps" to handle more generic binop cases where either/both operands can be cheaply shuffled.
2021-02-25 10:56:23 +00:00
serge-sans-paille f0e4610572 Support standalone build of clang-tidy unittest
Apply the same pattern as the one used in clangd/unittests/CMakeLists.txt

Differential Revision: https://reviews.llvm.org/D96788
2021-02-25 11:51:12 +01:00
Raphael Isemann 2d6b767c1d [lldb][NFC] Remove some obsolete comments in ClangASTImporter.cpp
The first two comments are incomplete and reference obsolete code. The
last one is just commented out code (that also doesn't look correct).
2021-02-25 11:44:19 +01:00
Raphael Isemann 7cfa6e1cc6 [lldb] Let ClangASTImporter assert that the target AST has an external source
This prevents people from accidentially using this code outside the
intended setup.
2021-02-25 11:42:14 +01:00
Harmen Stoppels a54f160b3a Prefer /usr/bin/env xxx over /usr/bin/xxx where xxx = perl, python, awk
Allow users to use a non-system version of perl, python and awk, which is useful
in certain package managers.

Reviewed By: JDevlieghere, MaskRay

Differential Revision: https://reviews.llvm.org/D95119
2021-02-25 11:32:27 +01:00
David Sherwood 87dbcd8865 [CodeGen] Canonicalise adds/subs of i1 vectors using XOR
When calling SelectionDAG::getNode() to create an ADD or SUB
of two vectors with i1 element types we can canonicalise this
to use XOR instead, where 1+1 is treated as wrapping around
to 0 and 0-1 wraps to 1.

I've added the following tests for SVE targets:

  CodeGen/AArch64/sve-pred-arith.ll

and modified some X86 tests to reflect the much simpler codegen
required.

Differential Revision: https://reviews.llvm.org/D97276
2021-02-25 10:31:26 +00:00
Tim Northover 201ada80ee AArch64: relax address-space assertion in FastISel.
Some people are using alternative address spaces to track GC data, but
otherwise they behave exactly the same. This is the only place in the backend
we even try to care about it so it's really not achieving anything.
2021-02-25 10:15:55 +00:00
Jan Svoboda d748908fa0 [clang][cli] Round-trip the whole CompilerInvocation
Finally, this patch moves from round-tripping one `CompilerInvocation` at a time to round-tripping the invocation as a whole.

This patch includes only the code required to make round-tripping the whole invocation work. More cleanups will be done in a follow-up patch.

Depends on D96847, D97041 & D97042.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D96280
2021-02-25 11:02:49 +01:00
Jan Svoboda a25e4a6da3 [clang][cli] Store additional optimization remarks info
After a revision of D96274 changed `DiagnosticOptions` to not store all remark arguments **as-written**, it is no longer possible to reconstruct the arguments accurately from the class.

This is caused by the fact that for `-Rpass=regexp` and friends, `DiagnosticOptions` store only the group name `pass` and not `regexp`. This is the same representation used for the plain `-Rpass` argument.

Note that each argument must be generated exactly once in `CompilerInvocation::generateCC1CommandLine`, otherwise each subsequent call would produce more arguments than the previous one. Currently this works out because of the way `RoundTrip` splits the responsibilities for certain arguments based on what arguments were queried during parsing. However, this invariant breaks when we move to single round-trip for the whole `CompilerInvocation`.

This patch ensures that for one `-Rpass=regexp` argument, we don't generate two arguments (`-Rpass` from `DiagnosticOptions` and `-Rpass=regexp` from `CodeGenOptions`) by shifting the responsibility for handling both cases to `CodeGenOptions`. To distinguish between the cases correctly, additional information is stored in `CodeGenOptions`.

The `CodeGenOptions` parser of `-Rpass[=regexp]` arguments also looks at `-Rno-pass` and `-R[no-]everything`, which is necessary for generating the correct argument regardless of the ordering of `CodeGenOptions`/`DiagnosticOptions` parsing/generation.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D96847
2021-02-25 11:02:49 +01:00
Stelios Ioannou 30cb9c03b5 [AArch64] Add abs intrinsic costs
This patch adds cost-modelling for abs vector intrinsic.

Change-Id: I89007971bfb15f5b4a02a2eadfd43018e9a73976
2021-02-25 09:31:52 +00:00
Haojian Wu b218f7c4ba [clangd] NFC, remove an extra "class" keyword. 2021-02-25 09:32:36 +01:00
Jan Svoboda d8a8e5d624 [clang][cli] Remove marshalling from Opt{In,Out}FFlag
We can now express all marshalling semantics in `Opt{In,Out}FFlag` via `BoolFOption`.

This patch moves remaining `Opt{In,Out}FFlag` instances using marshalling to `BoolFOption` and removes marshalling capabilities from `Opt{In,Out}FFlag` entirely.

This simplifies the decisions developers have to make when creating new boolean options:
  * For simple cc1 flag pairs, use `Bool{,F,G}Option`.
  * For cc1 flag pairs that require complex marshalling logic, use `Opt{In,Out}FFlag` and implement marshalling manually.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D97370
2021-02-25 08:53:58 +01:00
Jan Svoboda 88e45f00c1 [clang][cli] Add MarshallingInfoEnum multiclass
This patch introduces a tablegen multiclass called `MarshallingInfoEnum`. It has the same semantics as `MarshallingInfoString` had in combination with `AutoNormalizeEnum`, but it's easier to use and follows the convention used for other `MarshallingInfoXxx` multiclasses.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D97375
2021-02-25 08:47:18 +01:00
Marius Brehler 2d870a2f55 [mlir][nfc] Fix typo in documentation comment 2021-02-25 08:32:14 +01:00
Marius Brehler 699041123e [mlir] Fix emitting attribute documentation
This fixes the documentation emitted for type parameters. Also adds a
missing empty line, rendered as line break in mark down.

Co-authored-by: Simon Camphausen <simon.camphausen@iml.fraunhofer.de>

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D97267
2021-02-25 08:23:50 +01:00
Haojian Wu 77a8589e5d [clang][RecoveryAST] Add design doc to clang internal manual.
Hopefully it would be useful for new developers.

Differential Revision: https://reviews.llvm.org/D96944
2021-02-25 08:22:49 +01:00
Jonas Devlieghere 011a8e218e [debugserver] Fix logic to extract app bundle from file path
Fix the logic to find the app bundle in a path by correctly accounting
for paths containing multiple occurrences of `.app`. The new logic will
correctly extract `com.app.Foo.app` from `com.app.Foo.app/com.app.Foo`.

rdar://74666208

Differential revision: https://reviews.llvm.org/D97441
2021-02-24 23:08:42 -08:00
Pushpinder Singh 99951aa68d OpenMP: Fix object clobbering issue when using save-temps
There are two preconditions to reproduce the issue,
 1. Use -save-temps option
 2. Provide the -o option with name equal to the input file name
    without the file extension. For e.g. clang a.c -o a

With the -o specified, the AssembleJobAction after OffloadWrapperJobAction
will produce the object file with same name as host code object file.
Due to this clash, the OffloadWrapperAction overwrites the initial host
object file, which results in lld error. This also fixes the `multiple definition of __dummy.omp_offloading.entry'` issue in D96769 .

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D97273
2021-02-25 00:50:51 -05:00
Craig Topper 159f78fc2f [RISCV] Reuse existing SDLoc and XLenVT in the switch in RISCVISelDAGToDAG::Select. NFC
A SDLoc and XLenVT were already created above the switch.
2021-02-24 21:39:00 -08:00
Lang Hames 93c8246952 [docs][JITLink] Reintroduce JITLink design/API doc with fixes and improvements.
This document was originally introduced in ab4648504b, and was reverted in
912bc4980e while I investigated a number of shpinx bot errors. This commit
reintroduces the document with fixes for those errors, as well as some
improvements to the wording and formatting.
2021-02-25 15:27:59 +11:00
Evgeniy Brevnov 6d31ee1cea [NARY][NFC] New tests for upcoming changes. 2021-02-25 10:52:35 +07:00
Zarko Todorovski 1c051b7b70 [NFC][AIX] Rename aix-csr-vector.ll to aix-csr-vector-extabi.ll 2021-02-24 22:12:01 -05:00
Xun Li c38000a9fb [Coroutine] Check indirect uses of alloca when checking lifetime info
In the existing logic, we look at the lifetime.start marker of each alloca, and check all uses of the alloca, to see if any pair of the lifetime marker and an use of alloca crosses suspension point.
This approach is unfortunately incorrect. An use of alloca does not need to be a direct use, but can be an indirect use through alias.
Only checking direct uses can miss cases where indirect uses are crossing suspension point.
This can be demonstrated in the newly added test case 007.
In the test case, both x and y are only directly used prior to suspend, but they are captured into an alias, merged through a PHINode (so they couldn't be materialized), and used after CoroSuspend.
If we only check whether the lifetime starts cross suspension points with direct uses, we will put the allocas to the stack, and then capture their addresses in the frame.

Instead of fixing it in D96441 and D96566, this patch takes a different approach which I think is better.
We still checks the lifetime info in the same way as before, but with two differences:
1. The collection of liftime.start is moved into AllocaUseVisitor to make the logic more concentrated.
2. When looking at lifetime.start and use pairs, we not only checks the direct uses as before, but in this patch we check all uses collected by AllocaUseVisitor, which would include all indirect uses through alias. This will make the analysis more accurate without throwing away the lifetime optimization.

Differential Revision: https://reviews.llvm.org/D96922
2021-02-24 18:29:23 -08:00
Yang Fan b950de5c13
[docs] Add a release note for the removing of -Wreturn-std-move-in-c++11
`-Wreturn-std-move-in-c++11` has been removed in fbee4a0c79.

Reviewed By: aaronpuchert, amccarth

Differential Revision: https://reviews.llvm.org/D97364
2021-02-25 10:17:09 +08:00
Eric Schweitz 082ec3ab07 [flang][fir][NFC] Remove dead code.
This patch removes OpaqueAttr as it is no longer used.

Differential Revision: https://reviews.llvm.org/D97424
2021-02-24 17:31:23 -08:00
Valentin Clement 841f6995cd [flang][fir][NFC] Move remaining types to TableGen type definition
Move the remaing of FIR types to TableGen type definition. This follow suggestion in D96422.

Reviewed By: schweitz, jeanPerier, rriddle

Differential Revision: https://reviews.llvm.org/D96987
2021-02-24 20:23:31 -05:00
Arthur Eubanks a9b33ffb8f [ThinLTO][NewPM] Clean up dead code under -O0
We're running into undefined references using ThinLTO with -O0 on
Windows/Chrome. This fixes that.

This matches the legacy PM.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D97414
2021-02-24 17:08:57 -08:00
Liu, Chen3 4bc7c8631a [X86] Support amx-bf16 intrinsic.
Adding support for intrinsics of AMX-BF16.
This patch alse fix a bug that AMX-INT8 instructions will be selected with wrong
predicate.

Differential Revision: https://reviews.llvm.org/D97358
2021-02-25 09:06:48 +08:00
Greg McGary 151990dd94 [lld-macho] add code signature for native arm64 macOS
Differential Revision: https://reviews.llvm.org/D96164
2021-02-24 17:05:23 -08:00
Fangrui Song e9445765a5 [test] Improve SanitizerCoverage tests on !associated and comdat 2021-02-24 16:51:41 -08:00
Yaxun (Sam) Liu 392fd3f1bf update AMDGPU _Float16 support in clang doc
Reviewed by: Matt Arsenault

Differential Revision: https://reviews.llvm.org/D97386
2021-02-24 19:46:23 -05:00
Jonas Devlieghere b03bb054e1 [llvm] Check availability for os_signpost
Add availability checks to the os_signpost code so this can be used with
an older deployment target.

Differential revision: https://reviews.llvm.org/D97410
2021-02-24 16:27:31 -08:00
David Blaikie 7c926fee93 Improve attribute documentation for nodebug on typedefs
(followup to 8472fa6c54 )
2021-02-24 16:25:37 -08:00
Craig Topper efcdd598b7 [RISCV] Teach VSETVLI inserter to use VSETIVLI when possible.
We always create the VL operand using a register, but if we can
determine that it came from an ADDI X0, imm with a sufficiently
small immediate, we can use VSETIVLI.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D97332
2021-02-24 16:07:33 -08:00
Craig Topper 9bde29629d [RISCV] Use a ComplexPattern for zexti32 to match sexti32.
We just started using a ComplexPattern for sexti32. This updates
zexti32 to match.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D97231
2021-02-24 16:06:29 -08:00
Yaxun (Sam) Liu 47acdec1dd [CUDA][HIP] Support accessing static device variable in host code for -fgpu-rdc
For -fgpu-rdc mode, static device vars in different TU's may have the same name.
To support accessing file-scope static device variables in host code, we need to give them
a distinct name and external linkage. This can be done by postfixing each static device variable with
a distinct CUID (Compilation Unit ID) hash.

Since the static device variables have different name across compilation units, now we let
them have external linkage so that they can be looked up by the runtime.

Reviewed by: Artem Belevich, and Jon Chesterfield

Differential Revision: https://reviews.llvm.org/D85223
2021-02-24 18:23:45 -05:00
Jing Pu c519460745 Allow !shape.size type operands in "shape.from_extents" op.
This expands the op to support error propagation and also makes it symmetric with  "shape.get_extent" op.

Reviewed By: silvas

Differential Revision: https://reviews.llvm.org/D97261
2021-02-24 14:50:07 -08:00
Vedant Kumar a7d4826101 [profile] Fix buffer overrun when parsing %c in filename string
Fix a buffer overrun that can occur when parsing '%c' at the end of a
filename pattern string.

rdar://74571261

Reviewed By: kastiglione

Differential Revision: https://reviews.llvm.org/D97239
2021-02-24 14:49:45 -08:00
Ryan Prichard 680f836c2f Revert "[builtins] Define fmax and scalbn inline"
This reverts commit 341889ee9e.

The new unit tests fail on sanitizer-windows.
2021-02-24 14:47:48 -08:00
Markus Böck 9f1b832331 Reland "[Driver][Windows] Support per-target runtimes dir layout for profile instr generate"
This relands commit rG7f9d5d6e444c which was reverted in rGab5b00ada9e7

Differential Revision: https://reviews.llvm.org/D96638
2021-02-24 23:40:20 +01:00
Ryan Prichard 341889ee9e [builtins] Define fmax and scalbn inline
Define inline versions of __compiler_rt_fmax* and __compiler_rt_scalbn*
rather than depend on the versions in libm. As with
__compiler_rt_logbn*, these functions are only defined for single,
double, and quad precision (binary128).

Fixes PR32279 for targets using only these FP formats (e.g. Android
on arm/arm64/x86/x86_64).

For single and double precision, on AArch64, use __builtin_fmax[f]
instead of the new inline function, because the builtin expands to the
AArch64 fmaxnm instruction.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D91841
2021-02-24 14:27:37 -08:00
Stefan Agner a921aaf789 [MC][ARM] make Thumb function also if type attribute is set
Make sure to set the bottom bit of the symbol even when the type
attribute of a label is set after the label.

GNU as sets the thumb state according to the thumb state of the label.
If a .type directive is placed after the label, set the symbol's thumb
state according to the thumb state of the .type directive. This matches
GNU as in most cases.

From: Stefan Agner <stefan@agner.ch>

This fixes:
https://bugs.llvm.org/show_bug.cgi?id=44860
https://github.com/ClangBuiltLinux/linux/issues/866

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D74927
2021-02-24 14:08:56 -08:00
Petr Hosek ae7528a34e Revert "[Profile] Include a few asserts in coverage mapping test"
This reverts commit 80f329bcd0.
2021-02-24 14:01:42 -08:00