This adds support for providing information when hovering over
operation names, variables, patters, constraints, and rewrites.
Differential Revision: https://reviews.llvm.org/D121542
This commits adds a basic language server for PDLL to enable providing
language features in IDEs such as VSCode. This initial commit only
adds support for tracking definitions, references, and diagnostics, but
followup commits will build upon this to provide more significant behavior.
In addition to the server, this commit also updates mlir-vscode to support
the PDLL language and invoke the server.
Differential Revision: https://reviews.llvm.org/D121541
Update functions that previously took a loop pointer but only to get the
pre-header. Instead, pass the block directly. This removes the
requirement for the loop object to be created up-front.
Rename hasCMPXCHG16B() to canUseCMPXCHG16B() to make it less like other
feature functions. Add a similar canUseCMPXCHG8B() that aliases
hasCX8() to keep similar naming.
Differential Revision: https://reviews.llvm.org/D121978
With debug information enabled (-g) Clang will wrap the actual target
region into a new function which is called from the "kernel". The problem
is that the "kernel" is now basically a wrapper without all the things
we expect. More importantly, if we end up asking for an AAKernelInfo
for the "target region function" we might try to turn it into SPMD mode.
That used to cause an assertion as that function doesn't have an
appropriately named `_exec_mode` global. While the global is going away
soon we still need to make sure to properly handle this case, e.g.,
perform optimizations reliably.
Differential Revision: https://reviews.llvm.org/D122043
These are the last™ changes to the tests for constexpr preparation.
Reviewed By: Quuxplusone, #libc, Mordante
Spies: Mordante, EricWF, libcxx-commits
Differential Revision: https://reviews.llvm.org/D120951
If a X86ISD::BLENDV op appears before legalization (in this test case due to the icmp_slt x, 0) its constant mask was being treated as a vselect mask (mask != 0) instead of blendv (mask < 0)
This just prevents constant folding entirely for non-VSELECT ops.
combineSelect doesn't expect X86ISD::BLENDV ops to appear before legalization and is treating the constant mask as a vselect mask (mask != 0) instead of blendv (mask < 0)
As noticed in D119654, by adding the masked intrinsics results together we can end up with the selects being canonicalized away from the intrinsic - this isn't what we want to test here so replace with a insertvalue chain into a aggregate instead to retain all the results.
We can use MOVMSK+TEST/BT to extract individual bool elements even if the index isn't constant
This relies on combineBitcastvxi1 so some AVX512 cases still aren't optimized as they avoid MOVMSK usage.
When N > 12, (2^N -1) is not a legal add immediate (isLegalAddImmediate will return false).
ANd if SetCC input use this number, DAG combiner will generate one more SRL instruction.
So combine [setcc (srl x, imm), 0, ne] to [setcc (and x, (-1 << imm)), 0, ne] to get better optimization in emitComparison
Fix https://github.com/llvm/llvm-project/issues/54283
Reviewed By: paulwalker-arm
Differential Revision: https://reviews.llvm.org/D121449
When inverting the compare predicate trySwapVSelectOperands is
incorrectly using the type of the select's cond operand rather
than the type of cond's operands. This means we're treating all
inversions as if they're integer.
Differential Revision: https://reviews.llvm.org/D121968
Before it only accepted one output iterator type. Now it accepts all
output iterator types as required by BasicFormatter.
Reviewed By: #libc, ldionne
Differential Revision: https://reviews.llvm.org/D120916
This patch adds triple support for:
* dxil architecture
* shadermodel OS (with version parsing)
* shader stages as environment
Reviewed By: MaskRay, pete
Differential Revision: https://reviews.llvm.org/D122031
This patch slightly updates the behavior of scf.if->select to
place any hoisted select statements prior to the remaining scf.if body.
This allows better composition with other canonicalization passes, such as
scf.if nested merging.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D122027
Some refactoring and related fixes for more accurate
user program error recovery in the I/O runtime, especially
for error recovery with IOMSG= character values.
1) Move any work in an EndIoStatement() implementation
that may raise an error into a new CompleteOperation()
member function. This allows error handling APIs like
GetIoMsg() to complete a pending I/O statement and harvest
any errors that may result.
2) Move the pending error code from ErroneousIoStatementState
to a new pendingError_ data member in IoErrorHandler.
This allows IoErrorHandler::InError() to return a correct
result when there is a pending error that will be recovered
from so that I/O list data transfers don't crash in the meantime.
3) Don't create and leak a unit for a failed OPEN(NEWUNIT=n)
with error recovery, and don't modify 'n'. (Depends on
changes to API call ordering in lowering, in a separate patch;
code was added to ensure that OPEN statement control list
specifiers, e.g. SetFile(), must be passed before GetNewUnit().)
4) Fix the code that calls a form of strerror to fill an
IOMSG= variable so that it actually works for Fortran's
character type: blank fill with no null or newline termination.
Differential Revision: https://reviews.llvm.org/D122036
Support the names AND, OR, and XOR for the generic intrinsic
functions IAND, IOR, and IEOR respectively.
Differential Revision: https://reviews.llvm.org/D122034
In flang/runtime/transformational.cpp, there are many RUNTIME_CHECK assertions
for errors that should have been caught in semantics, but there are alno others
that signify program errors that in principle cannot be detected until
execution. Convert this second group into readable fatal error messages.
Also clean up some missing braces and incorrect printf formats found
along the way.
Differential Revision: https://reviews.llvm.org/D122037
This patch introduces a generic helper class that will listen for
event in a background thread and match it against a source broadcaster.
If the event received matches the source broadcaster, the event is
queued up in a list that the user can access later on.
The motivation behind this is to easily test new kinds of events
(i.e. Swift type-system progress events). However, this patch also
updates `TestProgressReporting.py` and `TestDiagnosticReporting.py`
to make use of this new helper class.
Differential Revision: https://reviews.llvm.org/D121977
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
Extend "extension<LanguageFeature>()" to incorporate an explanatory
message better than the current generic "nonstandard usage:".
Differential Revision: https://reviews.llvm.org/D122035
Regression from 2f497ec3; we should not try to generate ldrexd on
targets that don't have it.
Also, while I'm here, fix shouldExpandAtomicStoreInIR, for consistency.
That doesn't really have any practical effect, though. On Thumb targets
where we need to use __sync_* libcalls, there is no libcall for stores,
so SelectionDAG calls __sync_lock_test_and_set_8 anyway.
SLP is currently assuming that control dependence in these cases is irrelevant. This is only valid if none of the lib-funcs involved can throw or infinite loop in the scalar forms. This appears to be true (or at least we infer the respective attributes) for the libfuncs I spot checked. This change is mostly for shrunking the diff on an upcoming patch.
ExpandShapeOp builder cannot infer the result type since it doesn't know
how the dimension needs to be split. Remove this builder so that it
doesn't get used accidently. Also remove one potential path using it in
generic fusion.
Differential Revision: https://reviews.llvm.org/D122019
Generalize D99629 for ELF. A default visibility non-local symbol is preemptible
in a -shared link. `isInterposable` is an insufficient condition.
Moreover, a non-preemptible alias may be referenced in a sub constant expression
which intends to lower to a PC-relative relocation. Replacing the alias with a
preemptible aliasee may introduce a linker error.
Respect dso_preemptable and suppress optimization to fix the abose issues. With
the change, `alias = 345` will not be rewritten to use aliasee in a `-fpic`
compile.
```
int aliasee;
extern int alias __attribute__((alias("aliasee"), visibility("hidden")));
void foo() { alias = 345; } // intended to access the local copy
```
While here, refine the condition for the alias as well.
For some binary formats like COFF, `isInterposable` is a sufficient condition.
But I think canonicalization for the changed case has little advantage, so I
don't bother to add the `Triple(M.getTargetTriple()).isOSBinFormatELF()` or
`getPICLevel/getPIELevel` complexity.
For instrumentations, it's recommended not to create aliases that refer to
globals that have a weak linkage or is preemptible. However, the following is
supported and the IR needs to handle such cases.
```
int aliasee __attribute__((weak));
extern int alias __attribute__((alias("aliasee")));
```
There are other places where GlobalAlias isInterposable usage may need to be
fixed.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D107249
With D107249 I saw huge compile time regressions on a module (150s ->
5700s). This turned out to be due to a huge RefSCC in
the module. As we ran the function simplification pipeline on functions
in the SCCs in the RefSCC, some of those SCCs would be split out to
their RefSCC, a child of the current RefSCC. We'd skip the remaining
SCCs in the huge RefSCC because the current RefSCC is now the RefSCC
just split out, then revisit the original huge RefSCC from the
beginning. This happened many times because many functions in the
RefSCC were optimizable to the point of becoming their own RefSCC.
This patch makes it so we don't skip SCCs not in the current RefSCC so
that we split out all the child RefSCCs on the first iteration of
RefSCC. When we split out a RefSCC, we invalidate the original RefSCC
and add the remainder of the SCCs into a new RefSCC in
RCWorklist. This happens repeatedly until we finish visiting all
SCCs, at which point there is only one valid RefSCC in
RCWorklist from the original RefSCC containing all the SCCs that
were not split out, and we visit that.
For example, in the newly added test cgscc-refscc-mutation-order.ll,
we'd previously run instcombine in this order:
f1, f2, f1, f3, f1, f4, f1
Now it's:
f1, f2, f3, f4, f1
This can cause more passes to be run in some specific cases,
e.g. if f1<->f2 gets optimized to f1<-f2, we'd previously run f1, f2;
now we run f1, f2, f2.
This improves kimwitu++ compile times by a lot (12-15% for various -O3 configs):
https://llvm-compile-time-tracker.com/compare.php?from=2371c5a0e06d22b48da0427cebaf53a5e5c54635&to=00908f1d67400cab1ad7bcd7cacc7558d1672e97&stat=instructions
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D121953
ASSERT_THAT_EXPECTED implicitly calls takeError(), and calling
takeError() a second time returns nothing, so the check for the
content of the error text wasn't being executed.
Fixes Issue #48901
Found by the Rotten Green Tests project.
Fixes llvm-clang-x86_64-expensive-checks-debian failure with 2f497ec3.
expandAtomicStore always modifies the function, so make sure we set
MadeChange unconditionally. Not sure how nobody else has stumbled over
this before.
This is a follow up to 565603cc94,
which made macOS the default target OS for `-arch arm64` when
running on an Apple Silicon Mac. Now it'll be the default when
running on an Intel Mac too.
clang/test/Driver/apple-arm64-arch.c was a bit odd before: it was added
for the above commit, but tested the inverse behaviour and XFAIL'ed on
Apple Silicon. This inverts it to the (new) behaviour (that's now
correct regardless) and removes the XFAIL.
Radar-Id: rdar://90500294