Intrinsic properties can now be set to default and applied to all
intrinsics. If the attributes are not needed, the user can opt-out by
setting the DisableDefaultAttributes flag to true.
Differential Revision: https://reviews.llvm.org/D70365
This implements the assemble and disassemble support of RISCV Vector
extension Zvlsseg instructions, base on the 0.9 spec version.
Reviewed by HsiangKai
Differential Revision: https://reviews.llvm.org/D84416
Regarding this bug in Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=46451
I went ahead and fixed the regexp pattern and now Python script is able
to extract vplan graphs from the log files. Additionally some test for
this would be nice to have but I'm not sure are Python scripts tested
in LLVM and if so where they live.
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D86068
This patch fixes a bug which skipped
adding predicate matcher for a pattern in many cases.
For example, if predicate is Load and
its memoryVT is non-null then the loop
continues and never reaches to the end which
adds the predicate matcher. This patch moves the
matcher addition to the top of the loop
so that it gets added regardless of contextual checks
later in the loop.
Other way to fix this issue is to remove all "continue" statements
in checks and let the loop continue till end.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D83034
In our discussion D79699 SVE ptrace register access support we decide to
invalidate register context cached data on every stop instead of doing
at before Step/Resume.
InvalidateAllRegisters was added to facilitate flushing of SVE register
context configuration and cached register values. It now makes more
sense to move invalidation after every stop where we initiate SVE
configuration update if needed by calling ConfigureRegisterContext.
Reviewed By: labath
Differential Revision: https://reviews.llvm.org/D84501
This patch updates LLDB's in house version of SVE ptrace/sig macros by
converting them into constants and inlines. They are housed under sve
namespace and are used by process elf-core for reading SVE register data.
Reviewed By: labath
Differential Revision: https://reviews.llvm.org/D85641
In DAGTypeLegalizer::GenWidenVectorLoads the algorithm assumes it only
ever deals with fixed width types, hence the offsets for each individual
store never take 'vscale' into account. I've changed the code in that
function to use TypeSize instead of unsigned for tracking the remaining
load amount. In addition, I've changed the load loop to use the new
IncrementPointer helper function for updating the addresses in each
iteration, since this handles scalable vector types.
Also, I've added report_fatal_errors in GenWidenVectorExtLoads,
TargetLowering::scalarizeVectorLoad and TargetLowering::scalarizeVectorStores,
since these functions currently use a sequence of element-by-element
scalar loads/stores. In a similar vein, I've also added a fatal error
report in FindMemType for the case when we decide to return the element
type for a scalable vector type.
I've added new tests in
CodeGen/AArch64/sve-split-load.ll
CodeGen/AArch64/sve-ld-addressing-mode-reg-imm.ll
for the changes in GenWidenVectorLoads.
Differential Revision: https://reviews.llvm.org/D85909
mtune was previously ignored by the compiler so I'm not sure this
did anything. But after D85384 we're starting to support mtune
and this code is now causing a couple test failures on MacOS.
`dispatch_async_and_wait()` was introduced in macOS 10.14. Let's
forward declare it to ensure we can compile the test with older SDKs and
guard execution by checking if the symbol is available. (We can't use
`__builtin_available()`, because that itself requires a higher minimum
deployment target.) We also need to specify the `-undefined
dynamic_lookup` compiler flag.
Differential Revision: https://reviews.llvm.org/D85995
`dispatch_async_and_wait()` was introduced in macOS 10.14, which is
greater than our minimal deployment target. We need to forward declare
it as a "weak import" to ensure we generate a weak reference so the TSan
dylib continues to work on older systems. We cannot simply `#include
<dispatch.h>` or use the Darwin availability macros since this file is
multi-platform.
In addition, we want to prevent building these interceptors at all when
building with older SDKs because linking always fails.
Before:
```
➤ dyldinfo -bind ./lib/clang/12.0.0/lib/darwin/libclang_rt.tsan_osx_dynamic.dylib | grep dispatch_async_and_wait
__DATA __interpose 0x000F5E68 pointer 0 libSystem _dispatch_async_and_wait_f
```
After:
```
➤ dyldinfo -bind ./lib/clang/12.0.0/lib/darwin/libclang_rt.tsan_osx_dynamic.dylib | grep dispatch_async_and_wait
__DATA __got 0x000EC0A8 pointer 0 libSystem _dispatch_async_and_wait (weak import)
__DATA __interpose 0x000F5E78 pointer 0 libSystem _dispatch_async_and_wait (weak import)
```
This is a follow-up to D85854 and should fix:
https://reviews.llvm.org/D85854#2221529
Reviewed By: kubamracek
Differential Revision: https://reviews.llvm.org/D86103
The linker errors caused by this revision have been addressed.
Add interceptors for `dispatch_async_and_wait[_f]()` which was added in
macOS 10.14. This pair of functions is similar to `dispatch_sync()`,
but does not force a context switch of the queue onto the caller thread
when the queue is active (and hence is more efficient). For TSan, we
can apply the same semantics as for `dispatch_sync()`.
From the header docs:
> Differences with dispatch_sync()
>
> When the runtime has brought up a thread to invoke the asynchronous
> workitems already submitted to the specified queue, that servicing
> thread will also be used to execute synchronous work submitted to the
> queue with dispatch_async_and_wait().
>
> However, if the runtime has not brought up a thread to service the
> specified queue (because it has no workitems enqueued, or only
> synchronous workitems), then dispatch_async_and_wait() will invoke the
> workitem on the calling thread, similar to the behaviour of functions
> in the dispatch_sync family.
Additional context:
> The guidance is to use `dispatch_async_and_wait()` instead of
> `dispatch_sync()` when it is necessary to mix async and sync calls on
> the same queue. `dispatch_async_and_wait()` does not guarantee
> execution on the caller thread which allows to reduce context switches
> when the target queue is active.
> https://gist.github.com/tclementdev/6af616354912b0347cdf6db159c37057
rdar://35757961
Reviewed By: kubamracek
Differential Revision: https://reviews.llvm.org/D85854
This changes the behavior of constructing MLIRContext to no longer load globally
registered dialects on construction. Instead Dialects are only loaded explicitly
on demand:
- the Parser is lazily loading Dialects in the context as it encounters them
during parsing. This is the only purpose for registering dialects and not load
them in the context.
- Passes are expected to declare the dialects they will create entity from
(Operations, Attributes, or Types), and the PassManager is loading Dialects into
the Context when starting a pipeline.
This changes simplifies the configuration of the registration: a compiler only
need to load the dialect for the IR it will emit, and the optimizer is
self-contained and load the required Dialects. For example in the Toy tutorial,
the compiler only needs to load the Toy dialect in the Context, all the others
(linalg, affine, std, LLVM, ...) are automatically loaded depending on the
optimization pipeline enabled.
To adjust to this change, stop using the existing dialect registration: the
global registry will be removed soon.
1) For passes, you need to override the method:
virtual void getDependentDialects(DialectRegistry ®istry) const {}
and registery on the provided registry any dialect that this pass can produce.
Passes defined in TableGen can provide this list in the dependentDialects list
field.
2) For dialects, on construction you can register dependent dialects using the
provided MLIRContext: `context.getOrLoadDialect<DialectName>()`
This is useful if a dialect may canonicalize or have interfaces involving
another dialect.
3) For loading IR, dialect that can be in the input file must be explicitly
registered with the context. `MlirOptMain()` is taking an explicit registry for
this purpose. See how the standalone-opt.cpp example is setup:
mlir::DialectRegistry registry;
registry.insert<mlir::standalone::StandaloneDialect>();
registry.insert<mlir::StandardOpsDialect>();
Only operations from these two dialects can be in the input file. To include all
of the dialects in MLIR Core, you can populate the registry this way:
mlir::registerAllDialects(registry);
4) For `mlir-translate` callback, as well as frontend, Dialects can be loaded in
the context before emitting the IR: context.getOrLoadDialect<ToyDialect>()
Differential Revision: https://reviews.llvm.org/D85622
The documentation needs a refresh now that "kinds" are no longer a concept. This revision also adds mentions to a few other new concepts, e.g. traits and interfaces.
Differential Revision: https://reviews.llvm.org/D86182
Summary:
When the resource descriptor is of vgpr, we need a waterfall loop
to read into a sgpr. In this patchm we generalized the implementation
to work for any regster class sizes, and extend the work to MIMG
instructions.
Fixes: SWDEV-223405
Reviewers:
arsenm, nhaehnle
Differential Revision:
https://reviews.llvm.org/D82603
This changes the behavior of constructing MLIRContext to no longer load globally
registered dialects on construction. Instead Dialects are only loaded explicitly
on demand:
- the Parser is lazily loading Dialects in the context as it encounters them
during parsing. This is the only purpose for registering dialects and not load
them in the context.
- Passes are expected to declare the dialects they will create entity from
(Operations, Attributes, or Types), and the PassManager is loading Dialects into
the Context when starting a pipeline.
This changes simplifies the configuration of the registration: a compiler only
need to load the dialect for the IR it will emit, and the optimizer is
self-contained and load the required Dialects. For example in the Toy tutorial,
the compiler only needs to load the Toy dialect in the Context, all the others
(linalg, affine, std, LLVM, ...) are automatically loaded depending on the
optimization pipeline enabled.
To adjust to this change, stop using the existing dialect registration: the
global registry will be removed soon.
1) For passes, you need to override the method:
virtual void getDependentDialects(DialectRegistry ®istry) const {}
and registery on the provided registry any dialect that this pass can produce.
Passes defined in TableGen can provide this list in the dependentDialects list
field.
2) For dialects, on construction you can register dependent dialects using the
provided MLIRContext: `context.getOrLoadDialect<DialectName>()`
This is useful if a dialect may canonicalize or have interfaces involving
another dialect.
3) For loading IR, dialect that can be in the input file must be explicitly
registered with the context. `MlirOptMain()` is taking an explicit registry for
this purpose. See how the standalone-opt.cpp example is setup:
mlir::DialectRegistry registry;
registry.insert<mlir::standalone::StandaloneDialect>();
registry.insert<mlir::StandardOpsDialect>();
Only operations from these two dialects can be in the input file. To include all
of the dialects in MLIR Core, you can populate the registry this way:
mlir::registerAllDialects(registry);
4) For `mlir-translate` callback, as well as frontend, Dialects can be loaded in
the context before emitting the IR: context.getOrLoadDialect<ToyDialect>()
Differential Revision: https://reviews.llvm.org/D85622
StackLifetime class collects lifetime marker of an `alloca` by collect
the user of `BitCast` who is the user of the `alloca`. However, either
the `alloca` itself could be used with the lifetime marker or the `BitCast`
of the `alloca` could be transformed to other instructions. (e.g.,
it may be transformed to all zero reps in `InstCombine` pass).
This patch tries to fix this process in `collectMarkers` functions.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D85399
This greatly simplifies a large portion of the underlying infrastructure, allows for lookups of singleton classes to be much more efficient and always thread-safe(no locking). As a result of this, the dialect symbol registry has been removed as it is no longer necessary.
For users broken by this change, an alert was sent out(https://llvm.discourse.group/t/removing-kinds-from-attributes-and-types) that helps prevent a majority of the breakage surface area. All that should be necessary, if the advice in that alert was followed, is removing the kind passed to the ::get methods.
Differential Revision: https://reviews.llvm.org/D86121
Summary:
Caught by HWASAN on arm64 Android (which uses ld128 for long double). This
was running the existing fuzzer.
The specific minimized fuzz input to reproduce this is:
__cxa_demangle("1\006ILeeeEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE", 0, 0, 0);
Reviewers: eugenis, srhines, #libc_abi!
Subscribers: kristof.beyls, danielkiss, libcxx-commits
Tags: #libc_abi
Differential Revision: https://reviews.llvm.org/D77924
Building on the backend support from D85165. This parses the command line option in the driver, passes it on to CC1 and adds a function attribute.
-Still need to support tune on the target attribute.
-Need to use "generic" as the tuning by default. But need to change generic in the backend first.
-Need to set tune if march is specified and mtune isn't.
-May need to disable getHostCPUName's ability to guess CPU name from features when it doesn't have a family/model match for mtune=native. That's what gcc appears to do.
Differential Revision: https://reviews.llvm.org/D85384
Now that we no longer require for this map to have stable iteration order,
we no longer need to pay for keeping the iteration order stable,
so switch from `SmallMapVector` to `SmallDenseMap`.
Summary:
Make exactly single NodeBuilder exists at any given time
Reviewers: NoQ, Szelethus, vsavchenko, xazax.hun
Reviewed By: NoQ
Subscribers: martong, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D85796
While it may seem like we can just "deduplicate" the case where
some basic block happens to be a predecessor more than once,
which happens for e.g. switches, that is not correct thing to do.
We must actually add a PHI operand for each predecessor.
This was initially reported to me by David Major
as a clang crash during gecko build for android.
Although it works fine with glibc, as currently implemented the
frameheader cache is incompatible with certain platforms with
slightly different locking semantics inside dl_iterate_phdr.
Therefore only enable it when it is turned on explicitly with
a configure-time option.
Differential Revision: https://reviews.llvm.org/D86163
These instructions weren't in the initial version of MMX, but
were added when SSE1 was introduced. We already have the intrinsic
named correctly to include sse and the frontened header enforces
sse. We have one place in the backend where we DAG combine to
this intrinsic, but that's also qualified. So don't know of anything
currently broken unless someone writes their own IR and doesn't
set the sse feature.
This changes the behavior of constructing MLIRContext to no longer load globally
registered dialects on construction. Instead Dialects are only loaded explicitly
on demand:
- the Parser is lazily loading Dialects in the context as it encounters them
during parsing. This is the only purpose for registering dialects and not load
them in the context.
- Passes are expected to declare the dialects they will create entity from
(Operations, Attributes, or Types), and the PassManager is loading Dialects into
the Context when starting a pipeline.
This changes simplifies the configuration of the registration: a compiler only
need to load the dialect for the IR it will emit, and the optimizer is
self-contained and load the required Dialects. For example in the Toy tutorial,
the compiler only needs to load the Toy dialect in the Context, all the others
(linalg, affine, std, LLVM, ...) are automatically loaded depending on the
optimization pipeline enabled.
To adjust to this change, stop using the existing dialect registration: the
global registry will be removed soon.
1) For passes, you need to override the method:
virtual void getDependentDialects(DialectRegistry ®istry) const {}
and registery on the provided registry any dialect that this pass can produce.
Passes defined in TableGen can provide this list in the dependentDialects list
field.
2) For dialects, on construction you can register dependent dialects using the
provided MLIRContext: `context.getOrLoadDialect<DialectName>()`
This is useful if a dialect may canonicalize or have interfaces involving
another dialect.
3) For loading IR, dialect that can be in the input file must be explicitly
registered with the context. `MlirOptMain()` is taking an explicit registry for
this purpose. See how the standalone-opt.cpp example is setup:
mlir::DialectRegistry registry;
mlir::registerDialect<mlir::standalone::StandaloneDialect>();
mlir::registerDialect<mlir::StandardOpsDialect>();
Only operations from these two dialects can be in the input file. To include all
of the dialects in MLIR Core, you can populate the registry this way:
mlir::registerAllDialects(registry);
4) For `mlir-translate` callback, as well as frontend, Dialects can be loaded in
the context before emitting the IR: context.getOrLoadDialect<ToyDialect>()
LinalgDistribution options to allow more general distributions.
Changing the signature of the callback to send in the ranges for all
the parallel loops and expect a vector with the Value to use for the
processor-id and number-of-processors for each of the parallel loops.
Differential Revision: https://reviews.llvm.org/D86095
Checking if an object file is in memory should use the ObjectFile::IsInMemory(), not test ObjectFile::BaseAddress(). ObjectFile::BaseAddress() is designed to be overridden by all classes and is for mach-o, ELF and COFF plug-ins. They find the header base adddress and return that as a section offset address. The default implementation of ObjectFile::BaseAddress() does try and make an Address() from the ObjectFile::m_memory_addr, but I switched it to a correct function call.
Differential Revision: https://reviews.llvm.org/D86122
We probably want to introduce pseudo-instructions at some point, like
we have for binary operations, but this seems okay for now.
One thing I'm not sure about is whether we should be doing this as a
DAGCombine instead of directly pattern-matching it. I don't see any big
downside to doing it this way, though.
Differential Revision: https://reviews.llvm.org/D85681
This isn't necessaary for ACLE, but could be useful in other situations.
And the change is simple.
Differential Revision: https://reviews.llvm.org/D85251