Commit Graph

437777 Commits

Author SHA1 Message Date
Louis Dionne 258477ed0a [llvm] Remove libcxx, libcxxabi and libunwind from supported LLVM_ENABLE_PROJECTS
This is a breaking change. If you were passing one of those three runtimes
in LLVM_ENABLE_PROJECTS, you need to start passing them in LLVM_ENABLE_RUNTIMES
instead. The runtimes in LLVM_ENABLE_RUNTIMES will start being built using
the "bootstrapping build" instead, which means that they will be built
using the just-built Clang. This is usually what you wanted anyway.

If you were using LLVM_ENABLE_PROJECTS=all with the explicit goal of
building these three runtimes, you can now use LLVM_ENABLE_RUNTIMES=all
and these runtimes will be built using the bootstrapping build.

NOTE: This is a re-application of 887b8bd733 which had been reverted
      in 6b03a4fea0 because it broke the Sphinx documentation publishers.
      The Sphinx documentation publishers have now been moved to using
      the runtimes build, so this should not be an issue anymore.

Differential Revision: https://reviews.llvm.org/D132480
2022-10-04 09:04:12 -04:00
Denys Shabalin e3fd612e99 [mlir] Add fully dynamic constructor to StridedLayoutAttr bindings
Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D135139
2022-10-04 13:02:55 +00:00
Uday Bondhugula ddff3766b7 [MLIR] Simplify affine maps + operands exploiting IV info
Simplify affine expressions and maps while exploiting simple range and
step info of any IVs that are operands. This simplification is local,
O(1) and practically useful in several scenarios. Accesses with
floordiv's and mod's where the LHS is non-negative and bounded or is a
known multiple of a constant can often be simplified. This is
implemented as a canonicalization for all affine ops in a generic way:
all affine.load/store, vector_load/store, affine.apply, affine.min/max,
etc. ops.

Eg: For tiled loop nests accessing buffers this way:

affine.for %i = 0 to 1024 step 32 {
  affine.for %ii = 0 to 32 {
    affine.load [(%i + %ii) floordiv 32, (%i + %ii) mod 32]
  }
}

// Note that %i is a multiple of 32 and %ii < 32, hence:

(%i + %ii) floordiv 32 is the same as %i floordiv 32
(%i + %ii) mod 32 is the same as %ii mod 32.

The simplification leads to simpler index/subscript arithmetic for
multi-dimensional arrays and also in turn enables detection of spatial
locality (for vectorization for eg.), temporal locality or loop
invariance for hoisting or scalar replacement.

Differential Revision: https://reviews.llvm.org/D135085
2022-10-04 18:18:34 +05:30
Thomas Symalla 82cac65dd2 [NFC][AMDGPU] Pre-commit test for D134418. 2022-10-04 14:30:56 +02:00
Adrian Kuegel b8b5165f67 [mlir] Apply ClangTidy performance finding.
loop variable is copied but only used as const reference.
2022-10-04 14:07:39 +02:00
Alex Zinenko 3dfea727a4 [mlir] relax transform dialect multi-handle restriction
Relax the restriction in the transform dialect interpreter utilities
that expected a payload IR op to be assocaited with at most one
transform IR handle value. This was useful during the initial
bootstrapping to avoid use-after-free error equivalents when a payload
IR op could be erased through one of the handles associated with it and
then accessed through another. It was, however, possible to erase an
ancestor of the payload IR operation in question. The expensive-checks
mode of interpretation is able to detect both cases and has proven
sufficiently robust in debugging use-after-free errors.

Reviewed By: springerm

Differential Revision: https://reviews.llvm.org/D134964
2022-10-04 11:57:49 +00:00
Guray Ozen 89bb0cae46 [mlir][transform] Create GPU transform dialect
This revision adds GPU transform dialect. It also introduce a prefix such as "transform.gpu" for all ops related to this dialect.

MLIR already had two GPU transform op in linalg. This revision moves these ops into GPUTransformOps. The Ops are as follows:

`transform.structured.map_nested_foreach_thread_to_gpu_blocks`  -> `transform.gpu.map_foreach_to_blocks`
This op selects the outermost (toplevel) foreach_thread and parallelize across GPU blocks. It can also generate `gpu_launch`.

`transform.structured.map_nested_foreach_thread_to_gpu_threads` -> `transform.gpu.map_nested_foreach_to_threads`
This op parallelizes nested foreach_thread that are inside `gpu_launch` across GPU threads.

It doesn't add new functionality, but there are some minor refactoring of the code.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D134800
2022-10-04 13:09:08 +02:00
Bjorn Pettersson 491ac8f3e8 [LibCalls] Cast Char argument to 'int' before calling emitFPutC
The helpers in BuildLibCalls normally expect that the Value
arguments already have the correct type (matching the lib call
signature). And exception has been emitFPutC which casted the Char
argument to 'int' using CreateIntCast. This patch moves the cast to
the caller instead of doing it inside emitFPutC.

I think it makes sense to make the BuildLibCall API:s a bit
more consistent this way, despite the need to handle the int cast
in two different places now.

Differential Revision: https://reviews.llvm.org/D135066
2022-10-04 12:52:05 +02:00
Bjorn Pettersson aa1b64cc42 [BuildLibCalls] Use TLI to get 'int' and 'size_t' type sizes
Stop assuming that an 'int' is 32 bits in helpers that emit libcalls
to lib functions that had 'int' in the signature. For most targets
this is NFC. For a target with 16 bit 'int' type this could help out
detecting if trying to emit a libcall with incorrect signature.

Similarly we now derive the type mapping to 'size_t' by asking TLI
about the size of 'size_t'. This should be NFC (at least for in-tree
targets) since getSizeTSize(), in TLI, is deriving the size in the
same way as DataLayout::getIntPtrType().

Differential Revision: https://reviews.llvm.org/D135065
2022-10-04 12:52:05 +02:00
Bjorn Pettersson 73e8d95d28 [BuildLibCalls] Name types to identify when 'int' and 'size_t' is assumed. NFC
Lots of BuildLibCalls helpers are using Builder::getInt32Ty to get
a type matching an 'int', and DataLayout::getIntPtrType to get a
type matching 'size_t'. The former is not true for all targets, since
and 'int' isn't always 32 bits. And the latter is a bit weird as well
as the definition of DataLayout::getIntPtrType isn't clearly mapping
it to 'size_t'.

This patch is not aiming at solving any such problems. It is merely
highlighting when a libcall is expecting to use 'int' and 'size_t'
by naming the types as IntTy and SizeTTy when preparing the type
signatures for the emitted libcalls.

Differential Revision: https://reviews.llvm.org/D135064
2022-10-04 12:52:05 +02:00
Florian Hahn 825e16969e
[LAA] Pass LoopAccessInfoManager instead of GetLAA function.
Use LoopAccessInfoManager directly instead of various GetLAA lambdas.

Depends on D134608.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D134609
2022-10-04 11:51:25 +01:00
Amara Emerson 75b18ba14d Revert "[AArch64][GlobalISel] Fold away lowered vector sign-extend of vector compares."
This reverts commit dcd02a524b.

We should instead use the generic combine.
2022-10-04 11:03:02 +01:00
Nikita Popov 6e504d637d [ValueTracking] Handle constant exprs in isKnownNonZero()
Handle constant expressions by falling through to the general
operator-based code. In particular, this adds support for bitcast
and GEP expressions.
2022-10-04 11:58:07 +02:00
Daniel Bertalan 0d30e92f59
[lld-macho] Add support for emitting chained fixups
This commit adds support for chained fixups, which were introduced in
Apple's late 2020 OS releases. This format replaces the dyld opcodes
used for supplying rebase and binding information, and encodes most of
that data directly in the memory location that will have the fixup
applied.

This reduces binary size and is a requirement for page-in linking, which
will be available starting with macOS 13.

A high-level overview of the format and my implementation can be found
in SyntheticSections.h.

This feature is currently gated behind the `-fixup_chains` flag, and
will be enabled by default for supported targets in a later commit.

Like in ld64, lazy binding is disabled when chained fixups are in use,
and the `-init_offsets` transformation is performed by default.

Differential Revision: https://reviews.llvm.org/D132560
2022-10-04 11:48:45 +02:00
bipmis 8344dfab59 Add reverse load pattern tests 2022-10-04 10:39:41 +01:00
Florian Hahn e399dd601f
[SimpleLoopUnswitch] Clear block and loop dispos after destroying loop.
SimpleLoopUnswitch may remove loops. Clear block and loop dispositions,
to clean up invalid entries in the cache.

Fixes #58136.
2022-10-04 10:27:52 +01:00
Nikita Popov 635f93dff7 [SimplifyLibCalls] Place deref attr even if nonnull already set
If nonnull is already set, we currently skip setting both nonnull
and dereferenceable. Make these independent, to avoid regressions
when additional nonnull attributes are inferred earlier.
2022-10-04 11:26:15 +02:00
Nikita Popov 0f32f0e147 Revert "[InstCombine] Switch foldOpIntoPhi() to use InstSimplify"
This reverts commit b20e34b39f.

This causes RAUW type mismatch assertions on some buildbots,
reverting for now.
2022-10-04 11:17:09 +02:00
Nikita Popov 45dec8f5fd [ValueTracking] Avoid known bits fallthrough for freeze (NFCI)
The known bits logic should never produce a better result than
the direct recursive non-zero query here, so skip the fallthrough.
2022-10-04 11:02:31 +02:00
Nikita Popov 9c0314f54e [ValueTracking] Switch isKnownNonZero() to switch over opcodes (NFCI)
The change in the assume-queries-counter.ll test is because we skip
and unnecessary known bits query for arguments.
2022-10-04 10:54:28 +02:00
Matthias Springer 81ca5aa452 [mlir][tensor][NFC] Rename linalg.init_tensor to tensor.empty
tensor.empty/linalg.init_tensor produces an uninititalized tensor that can be used as a destination operand for destination-style ops (ops that implement `DestinationStyleOpInterface`).

This change makes it possible to implement `TilingInterface` for non-destination-style ops without depending on the Linalg dialect.

RFC: https://discourse.llvm.org/t/rfc-add-tensor-from-shape-operation/65101

Differential Revision: https://reviews.llvm.org/D135129
2022-10-04 17:25:35 +09:00
Nikita Popov b20e34b39f [InstCombine] Switch foldOpIntoPhi() to use InstSimplify
foldOpIntoPhi() currently only folds operations into the phi if all
but one operands constant-fold. The two exceptions to this are freeze
and select, where we allow more general simplification.

This patch makes foldOpIntoPhi() generally simplification based and
removes all the instruction-specific logic. We just try to simplify
the instruction for each operand, and for the (potentially) one
non-simplified operand, we move it into the new block with adjusted
operands.

This fixes https://github.com/llvm/llvm-project/issues/57448, which
was my original motivation for the change.
2022-10-04 10:12:14 +02:00
Valentin Clement 9d99b482cd
[flang] Lower polymorphic entities types in dummy argument and function result
This patch updates lowering to produce the correct fir.class types for
various polymorphic and unlimited polymoprhic entities cases. This is only the
lowering. Some TODOs have been added to the CodeGen part to avoid errors since
this part still need to be updated as well.
The fir.class<*> representation for unlimited polymorphic entities mentioned in
the document has been updated to fir.class<none> to avoid useless work in pretty
parse/printer.

This patch is part of the implementation of the poltymorphic
entities.
https://github.com/llvm/llvm-project/blob/main/flang/docs/PolymorphicEntities.md

Depends on D134957

Reviewed By: jeanPerier

Differential Revision: https://reviews.llvm.org/D134959
2022-10-04 09:43:59 +02:00
Florian Hahn db720dc17c
[LAA] Use LoopAccessInfoManager in legacy pass.
Simplify LoopAccessLegacyAnalysis by using LoopAccessInfoManager from
D134606. As a side-effect this also removes printing support from
LoopAccessLegacyAnalysis.

Depends on D134606.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D134608
2022-10-04 08:37:11 +01:00
LLVM GN Syncbot 1303abe658 [gn build] Port 6d9eb53329 2022-10-04 07:16:10 +00:00
Balázs Kéri 6d9eb53329 [clang-tidy] Add checker 'bugprone-suspicious-realloc-usage'.
Add a check to detect usages of `realloc` where the result is assigned
to the same variable (or field) as passed to the first argument.

Reviewed By: steakhal, martong

Differential Revision: https://reviews.llvm.org/D133119
2022-10-04 09:14:46 +02:00
Nicolas Vasilache 46869eebdc [mlir][Memref] NFC - Addresult pretty printing to MemrefOps
Differential Revision: https://reviews.llvm.org/D134968
2022-10-04 00:05:16 -07:00
Lang Hames 7ec6dde83a [llvm-jitlink] Teach InProcessDeltaMapper to honor -slab-page-size option.
The -slab-page-size option is used to set a simulated page size in -no-exec
tests. In order for this to work we need to use read/write permissions only
on all simulated pages in order to ensure that no simulated page is made
read-only by a permission change to the underlying real page.

The aim of this patch is to make it safe to enable ExecutionEngine regression
tests on arm64. Those tests will be enabled in a follow-up patch.
2022-10-03 21:50:01 -07:00
Lang Hames 3019f488f4 [ORC] Don't unnecessarily copy collection element. 2022-10-03 21:50:01 -07:00
Craig Topper 05df15965b [RISCV] Use _TIED form of VFWADD(U)_WV/VFWSUB(U)_WV to avoid early clobber.
One of the sources is the same size as the destination so that source
doesn't have an overlap with the destination register. By using the _TIED
form we avoid an early clobber contraint for that source.

This matches what was already done for instrinsics. ConvertToThreeAddress
will fix it if it can't stay tied.
2022-10-03 21:44:08 -07:00
Craig Topper b41fe90dc3 [RISCV] Correct the setcc in vp.floor/ceil/round/roundeven lowering.
We want to emit a masked setcc that preserves zeros in all of the bits
where the original mask is zero. To do this we need to pass the original
mask as the passthru operand as well. Otherwise, we'll use the mask agnostic
policy and replace the zeros with 1s on some CPUs.

Differential Revision: https://reviews.llvm.org/D135122
2022-10-03 20:58:05 -07:00
Lang Hames ff85a1879c [ORC] Fix typo in 543790add8. 2022-10-03 20:43:48 -07:00
Lang Hames 516397e144 [ORC] More attempts to fix Windows bots after d3d9f7caf9.
Move getWindowsProtectionFlags inside namespace to make MemProt type accessible.
2022-10-03 20:31:31 -07:00
Lang Hames 543790add8 [ORC] Attempt to fix Windows bots after d3d9f7caf9.
That patch failed to include an update to the Windows side of
ExecutorSharedMemoryMapperService.
2022-10-03 20:15:58 -07:00
LLVM GN Syncbot 3688102aab [gn build] Port d3d9f7caf9 2022-10-04 02:36:02 +00:00
Lang Hames d3d9f7caf9 [ORC][JITLink] Move MemoryFlags.h (MemProt, AllocGroup,...) from JITLink to ORC.
Moving these types to OrcShared eliminates the need for the separate
WireProtectionFlags type.
2022-10-03 19:35:34 -07:00
changkaiyan bd561ca66b [bug] The additional patch committed file was deleted.
Differential Revision: https://reviews.llvm.org/D134696

	deleted:    202209301111.patch
2022-10-04 10:19:59 +08:00
Jez Ng 6ffdb67b53 [MC][test] Update arm64-leaf-compact-unwind.s to use llvm-objdump
This addresses the long-standing FIXME in the test. I would like to
update the test, and objdump's output is a lot more readable / editable
than readobj's.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D134690
2022-10-03 22:16:40 -04:00
Jakub Kuderski 247c84aef6 [mlir] Reduce call stack depth in LogicalResult. NFC.
When debuging a crash or conversion failure in a deep pass pipeline,
there are often many interleaved frames with `failed` and `succeeded`.
`LogicalResult` is used through the pass infrastructure, so by not implementing
failure in terms of a call to succeess, this patch noticeably reduces the total
total call stack depth and improves the debugging experience.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D135116
2022-10-03 21:59:02 -04:00
Jordan Rupprecht de471fee27 [bazel] port d033ece0c9 2022-10-03 18:49:14 -07:00
changkaiyan c4cc755c72 [mlir][mlir-translation] patch for standalone-translation command line description missing.
Differential Revision: https://reviews.llvm.org/D134696

	modified:   mlir/examples/standalone/standalone-translate/standalone-translate.cpp
	modified:   mlir/include/mlir/Tools/mlir-translate/Translation.h
	modified:   mlir/lib/Target/Cpp/TranslateRegistration.cpp
	modified:   mlir/lib/Target/LLVMIR/ConvertFromLLVMIR.cpp
	modified:   mlir/lib/Target/LLVMIR/ConvertToLLVMIR.cpp
	modified:   mlir/lib/Target/SPIRV/TranslateRegistration.cpp
	modified:   mlir/lib/Tools/mlir-translate/Translation.cpp
2022-10-04 09:14:40 +08:00
Jim Ingham 852a4bdb25 Change the Sanitizer report breakpoint callbacks to asynchronous.
The synchronous callbacks are not intended to start the target running
during the callback, and doing so is flakey.  This patch converts them
to being regular async callbacks, and adds some testing for sequential
reports that have caused problems in the field.

Differential Revision: https://reviews.llvm.org/D134927
2022-10-03 18:10:28 -07:00
Nemanja Ivanovic 4ea121c904 [PowerPC] Fix a number of inefficiencies and issues with atomic code gen
There are a few issues with the code we generate for atomic operations and the way we generate it:

- Hard coded CR0 for compares
- Order of operands for compares not conducive to
  emitting compare-immediate or for CSE of compares
- Missing MachineMemOperand for st[bhwd]cx intrinsics
- Missing intrinsic properties for the same
- Unnecessary blocks with store conditional
  instructions to clear reservation (which ends
  up hindering performance)
- Move from CR instructions just to compare the
  result of a store conditional with zero (even
  though it is a record-form)

This patch aims to resolve all of those issues.

Differential revision: https://reviews.llvm.org/D134783
2022-10-03 19:55:29 -05:00
Victor Michel 9cf60d8479 [llvm-gsymutil] Fix tracking of currently open file
Prior to this change, `CurrentGSYMPath` was never updated. As a consequence, the GSYM file was reopened for every frame, even if all frames were relative to the same GSYM file.

This change brings a 13x speedup on a test I'm doing (symbolizing ~25K frames from libxul)

(This is my first-ever LLVM change - sorry if I missed something in the process!)

Reviewed By: simon.giesecke, clayborg

Differential Revision: https://reviews.llvm.org/D132912
2022-10-03 17:49:12 -07:00
Sam Clegg 0a9756fc15 [lld][WebAssemlby] Improve support for -L / -l and add testing
- Add support -Bdynamic/-Bstatic and their aliases
- Add support for `--library` and `--library-path` long form args
- Add test based on test/ELF/libsearch.s
- In `-Bdynamic` mode search for `.so` files in preference to `.a`.
- Unlike ELF continue to default to static mode until `-pie` or
  `-shared` are used.

Differential Revision: https://reviews.llvm.org/D135087
2022-10-03 16:53:30 -07:00
Jeff Niu d67def8704 [mlir][analysis] Remove empty files (NFC) 2022-10-03 16:52:53 -07:00
Nico Weber fb5a63e9af [gn build] port d033ece0c9 for now 2022-10-03 19:50:21 -04:00
Yuanfang Chen 1fb728e95c [c++] implements tentative DR1432 for partial ordering of function template
D128745 handled DR1432 for the partial ordering of partial specializations, but
missed the handling for the partial ordering of function templates. This patch
implements the latter. While at it, also simplifies the previous implementation to
be more close to the wording without functional changes.

Fixes https://github.com/llvm/llvm-project/issues/56090

Reviewed By: erichkeane, #clang-language-wg, mizvekov

Differential Revision: https://reviews.llvm.org/D133683
2022-10-03 16:30:27 -07:00
Jessica Paquette c7652dbed4 NFC: Fix legalizer-info-validation again
Also fix some lines that should have been DEBUG-NEXT, which made it a bit harder to
see what was happening here.
2022-10-03 16:24:12 -07:00
Michael Holman 513f89dc8d Add functionality to load dynamic libraries temporarily
Previously, it was possible to load dynamic libraries which would be unloaded on llvm_shutdown(), but recently ManagedStatic removal changed this so that loaded libraries really can't ever be unloaded. This functionality was very useful, and so to add it back in a more explicit way, I've added new getLibrary() and closeLibrary() methods to allow callers to use the very convenient platform independent abstraction that LLVM has for dynamic libraries.

As a specific use case, the onnx-mlir project was using this functionality with an API that allows instancing LLVM so you can compile a shared library, and then load that library, and eventually close the instance (and library) and compile something else. This change to llvm_shutdown causes libraries to leak and also locks the libraries for the entire duration of the program which prevents reusing library names.

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D134763
2022-10-03 16:20:22 -07:00