This instruction is a nop on all server cores (certainly on all
cores that AIX supports) so it is fine to emit a nop instead of it.
In fact, that is exactly what XL emits. So we emit a nop on AIX
and we leave the codegen as is on other platforms since there may
indeed be cores out there for which this actually does some prefetching.
This adds support to the X86 backend for the newly committed swiftasync
function parameter. If such a (pointer) parameter is present it gets stored
into an augmented frame record (populated in IR, but generally containing
enhanced backtrace for coroutines using lots of tail calls back and forth).
The context frame is identical to AArch64 (primarily so that unwinders etc
don't get extra complexity). Specfically, the new frame record is [AsyncCtx,
%rbp, ReturnAddr], and its presence is signalled by bit 60 of the stored %rbp
being set to 1. %rbp still points to the frame pointer in memory for backwards
compatibility (only partial on x86, but OTOH the weird AsyncCtx before the rest
of the record is because of x86).
Execute implementations already checks for permissions and existence
and returns relevant errors as necessary, so instead of printing our own errors,
we just print theirs.
This also fixes a case in windows where the driver might be missing the `.exe`
suffix. Previously, clangd would reject such a driver because sys::fs::exists is
strict, whereas the underlying Execute implementation would check with `.exe`
suffix too.
Fixes https://github.com/clangd/clangd/issues/93
Differential Revision: https://reviews.llvm.org/D102431
At the moment `MlirModule`s can be converted to `MlirOperation`s, but not
the other way around (at least not without going around the C API). This
makes it impossible to e.g. run passes over a `ModuleOp` created through
`mlirOperationCreate`.
Reviewed By: nicolasvasilache, mehdi_amini
Differential Revision: https://reviews.llvm.org/D102497
Ensure we tell getShiftAmountTy that we're working with pre-legalized types to prevent cases where the (legalized) shift type can no longer handle the (non-legalized) type width.
Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=34366
Swift's new concurrency features are going to require guaranteed tail calls so
that they don't consume excessive amounts of stack space. This would normally
mean "tailcc", but there are also Swift-specific ABI desires that don't
naturally go along with "tailcc" so this adds another calling convention that's
the combination of "swiftcc" and "tailcc".
Support is added for AArch64 and X86 for now.
This patch adds support for inferred modules to the dependency scanner.
Effectively a cherry-pick of https://github.com/apple/llvm-project/pull/699 authored by @Bigcheese with libclang and other changes omitted.
Contains following changes:
1. [Clang][ScanDeps] Ignore __inferred_module.map dependency.
* This shows up with inferred modules, but it doesn't exist on disk, so don't report it as a dependency.
2. [Clang][ScanDeps] Use the module map a module was inferred from for inferred modules.
Also includes a smoke test that uses clang-scan-deps output to perform an explicit build. There's no intention to duplicate whatever `test/Modules` contains, just to verify the produced command-line does "work" (with very loose definition of work).
Split from D100934.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D102495
Splitting the memref dialect lead to an introduction of several dependencies
to avoid compilation issues. The canonicalize pass also depends on the
memref dialect, but it shouldn't. This patch resolves the dependencies
and the unintuitive includes are removed. However, the dependency moves
to the constructor of the std dialect.
Differential Revision: https://reviews.llvm.org/D102060
Replace the templated linalgLowerOpToLoops method by three specialized methods linalgOpToLoops, LinalgOpToParallelLoops, and linalgOpToAffineLoops.
Differential Revision: https://reviews.llvm.org/D102324
AArch64's fctv* instructions implement the saturating behaviour that the
fpto*i.sat intrinsics require, in cases where the destination width
matches the saturation width. Lowering them removes a lot of unnecessary
generated code.
Only scalar lowerings are supported for now.
Differential Revision: https://reviews.llvm.org/D102353
This is just a dotest check to see if we can compile a simple program that uses
libc++. Right now we are parsing the rather big `algorithm` header in the test
program, but the test really just checks whether we can find *any* libc++
headers and link against some libc++ SO. Using the much smaller `cassert` header
for checking whether we can find libc++ headers speeds up this check by a bit.
After some incredibly unscientific performance testing this saves a few seconds
when running the test suite on Linux (on macOS we hardcoded that libc++ is
always there, so this check won't be used there and we don't save any time).
Reviewed By: jankratochvil
Differential Revision: https://reviews.llvm.org/D101056
Tweaks like DefineOutline depend on FS to be set at `apply()` time.
After https://reviews.llvm.org/D93978, tweaks run from Check tool lost
access to FS. This makes the available to apply() once again.
Differential Revision: https://reviews.llvm.org/D102519
This patch specifies a few guidelines that our API tests should follow.
The motivations for this are twofold:
1. API tests have unexpected pitfalls that especially new contributors run into
when writing tests. To prevent the frustration of letting people figure those
pitfalls out by trial-and-error, let's just document them briefly in one place.
2. It prevents some arguing about what is the right way to write tests. I really
like to have fast and reliable API test suite, but I also don't want to be the
bogeyman that has to insist in every review that the test should be rewritten to
not launch a process for no good reason. It's much easier to just point to a
policy document.
I omitted some guidelines that I think could be controversial (e.g., the whole
"should assert message describe failure or success").
Reviewed By: shafik
Differential Revision: https://reviews.llvm.org/D101153
This patch enables explicitly building inferred modules.
Effectively a cherry-pick of https://github.com/apple/llvm-project/pull/699 authored by @Bigcheese with libclang and dependency scanner changes omitted.
Contains the following changes:
1. [Clang] Fix the header paths in clang::Module for inferred modules.
* The UmbrellaAsWritten and NameAsWritten fields in clang::Module are a lie for framework modules. For those they actually are the path to the header or umbrella relative to the clang::Module::Directory.
* The exception to this case is for inferred modules. Here it actually is the name as written, because we print out the module and read it back in when implicitly building modules. This causes a problem when explicitly building an inferred module, as we skip the printing out step.
* In order to fix this issue this patch adds a new field for the path we want to use in getInputBufferForModule. It also makes NameAsWritten actually be the name written in the module map file (or that would be, in the case of an inferred module).
2. [Clang] Allow explicitly building an inferred module.
* Building the actual module still fails, but make sure it fails for the right reason.
Split from D100934.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D102491
The select-of-constants transform was asserting that its constant vector
inputs did not implicitly truncate their input without that as an
explicit precondition to the function. This patch relaxes that assertion
into an early return to skip the optimization.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D102393
Currently the DexLimitSteps command requires at least one condition. This patch
lets users elide the condition to specify that the breakpoint range should
always be activated when the leading line is stepped on. This patch also
updates the terminology used in the `ConditionalController` class from the
terms 'conditional' and 'unconditional' to 'leading' and 'trailing' when
referring to the breakpoints in the DexLimitSteps range because the leading
breakpoint can now be unconditional.
Reviewed By: chrisjackson
Differential Revision: https://reviews.llvm.org/D101438
Remove the `ConditionalController._conditional_met` method. This was missed in
the recent ConditionalController refactor (D98699). We don't need to check that
the conditions for a conditional breakpoint have been met because
`DebuggerBase.get_triggered_breakpoint_ids` returns the set of ids for
breakpoints which have been triggered.
To get the "triggered breakpoints" from lldb we use `GetStopReasonDataCount`
and `GetStopReasonDataAtIndex`. It seems that these functions count all
breakpoints associated with the location which lldb has stopped at, regardless
of their condition. i.e. Even if we have two breakpoints at the same source
location that have mutually exclusive conditions, both will be found this way
when either condition is true. To get around this, we store a map of breakpoint
{id: condition} `_breakpoint_conditions` and evaluate the conditions of the
triggered breakpoints to filter the set down to those which are unconditional
or have a condition which evaluates to true.
Essentially we are just moving the condition double check from a general
debugger controller into the lldb specific wrapper. This tidy up will help make
upcoming patches simpler.
Reviewed By: chrisjackson
Differential Revision: https://reviews.llvm.org/D101431
This member function was introduced in 0a92e09c ([clang][deps] Generate the full command-line for modules) in order to keep the CompilerInvocation object alive after CompilerInstance goes out of scope. However, d3fb4b90 ([clang][deps] NFC: Report modules' context hash) removes that use-case, making this function dead.
This patch eagerly constructs and modifies CompilerInvocation of modular dependencies in order to report the correct context hash instead of the hash of the original translation unit.
No functionality change here, since we currently don't modify CompilerInvocation in a way that affects the context hash.
Depends on D102473.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D102482
The context hash of modular dependencies can be different from the context hash of the original translation unit if we modify their `CompilerInvocation`s.
Stop assuming the TU's context hash everywhere.
No functionality change here, since we're still currently using the unmodified TU CompilerInvocation to compute the context hash.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D102473
Add TransferWritePermutationLowering, which replaces permutation maps of TransferWriteOps with vector.transpose.
Differential Revision: https://reviews.llvm.org/D102548
With prelink inlining, pseudo probes with same ID can come from different inline contexts. Such probes should not share samples and their factors should be fixed up separately.
I'm seeing 0.3% speedup for SPEC2017 overall. Benchmark 631.deepsjeng_s benefits the most, about 4%.
Reviewed By: wenlei, wmi
Differential Revision: https://reviews.llvm.org/D102429
ScheduleDAGFast.cpp is compiled to object file, but the ScheduleDAGFast
object file isn't linked into clang executable file as no symbol is
referred by outside. Add calling to createXxx of ScheduleDAGFast.cpp,
then the ScheduleDAGFast object file will be linked into clang
executable file. The static RegisterScheduler will register scheduler
fast and linearize at clang boot time.
Reviewed By: pengfei
Differential Revision: https://reviews.llvm.org/D101601
D62727 removed GotEntrySize and GotPltEntrySize with a comment that they
are always equal to wordsize(), but that is not entirely true: X32 has a
word size of 4, but needs 8-byte GOT entries. This restores gotEntrySize
for both, adjusted for current naming conventions, but defaults it to
config->wordsize to keep things simple for architectures other than
x86_64.
This partially reverts D62727.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D102509
The ComplexPattern is looking for an immediate in a certain range
that has a single use. This can be handled with a PatLeaf since
we aren't matching multiple patterns or checking any complicated
relationships between nodes.
This shrinks the isel table a little bit since tablegen no longer
has to generate patterns with commuted operands. With the PatLeaf,
tablegen can see we're matching an immediate which should always
be on the right hand side of add.
Reviewed By: benshi001
Differential Revision: https://reviews.llvm.org/D102510
This adds a simple fold into codegenprepare that converts comparison of
branches towards comparison with zero if possible. For example:
%c = icmp ult %x, 8
br %c, bla, blb
%tc = lshr %x, 3
becomes
%tc = lshr %x, 3
%c = icmp eq %tc, 0
br %c, bla, blb
As a first order approximation, this can reduce the number of
instructions needed to perform the branch as the shift is (often) needed
anyway. At the moment this does not effect very much, as llvm tends to
prefer the opposite form. But it can protect against regressions from
commits like rG9423f78240a2.
Simple cases of Add and Sub are added along with Shift, equally as the
comparison to zero can often be folded with cpsr flags.
Differential Revision: https://reviews.llvm.org/D101778