This change reorganizes the code and comments to make the expected semantics of these routines more clear. However, this is *not* an NFC change. The functional change is having isScalarWithPredication return false if the instruction does not need predicated. Specifically, for the case of a uniform memory operation we were previously considering it *not* to be a predicated instruction, but *were* considering it to be scalable with predication.
As can be seen with the test changes, this causes uniform memory ops which should have been lowered as uniform-per-parts values to instead be lowering via naive scalarization or if scalarization is infeasible (i.e. scalable vectors) aborted entirely. I also don't trust the code to bail out correctly 100% of the time, so it's possible we had a crash or miscompile from trying to scalarize something which isn't scalaralizable. I haven't found a concrete example here, but I am suspicious.
Differential Revision: https://reviews.llvm.org/D131093
The code generated for this version of the intrinsic is broken. I'm
marking it as a "TODO" for now so that people don't get unannounce bad
results.
Differential Revision: https://reviews.llvm.org/D132082
* Replace getUserCost with getInstructionCost, covering all cost kinds.
* Remove getInstructionLatency, it's not implemented by any backends, and we should fold the functionality into getUserCost (now getInstructionCost) to make it easier for targets to handle the cost kinds with their existing cost callbacks.
Original Patch by @samparker (Sam Parker)
Differential Revision: https://reviews.llvm.org/D79483
This patch fixes a crash when trying to emit a constant compound literal.
For C++ Clang evaluates either casts or binary operations at translation time,
but doesn't pass on the InConstantContext information that was inferred when
parsing the statement. Because of this, strict FP evaluation (-ftrapping-math)
which shouldn't be in effect yet, then causes checkFloatingpointResult to return
false, which in tryEmitGlobalCompoundLiteral will trigger an assert that the
compound literal wasn't constant.
The discussion here around 'manifestly constant evaluated contexts' was very
helpful to me when trying to understand what LLVM's position is on what
evaluation context should be in effect, together with the explanatory text in
that patch itself:
https://reviews.llvm.org/D87528
Reviewed By: rjmccall, DavidSpickett
Differential Revision: https://reviews.llvm.org/D131555
This fixes a bug in SCF/AffineCanonicalizationUtils.cpp. Loop lb/ub were previously considered dimensions, which caused a crash when a (non-optimizable) affine.min / affine.max expression was processed (due to multiplication of two dims). Lb/ub are now considered symbols and symbols may be multiplied. (The scope of the analysis is "within the loop body", at which point lb/ub are constants.)
Differential Revision: https://reviews.llvm.org/D132021
bufferization.to_memref ops are not supported in One-Shot Analysis. They often trigger a failed assertion that can be confusing. Instead, scan for to_memref ops before running the analysis and immediately abort with a proper error message.
Differential Revision: https://reviews.llvm.org/D132027
Part of the schedule model is not accurate, so need a initial test record the changes.
This assemble list is refer to the basic part of D128631
Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D132103
Arrange the operations in alphabetical order, rather than what
appears to be the order they were added. This was suggested in
a review when adding new operations.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D132046
A number of mlir tests `FAIL` on Solaris/sparcv9 with `Target has no JIT
support`. This patch fixes that by mimicing `clang/test/lit.cfg.py` which
implements a `host-supports-jit` keyword for this. The gtest-based unit
tests don't support `REQUIRES:`, so lack of support needs to be hardcoded
there.
Tested on `amd64-pc-solaris2.11` (`check-mlir` results unchanged) and
`sparcv9-sun-solaris2.11` (only one unrelated failure left).
Differential Revision: https://reviews.llvm.org/D131151
Remove hardcoded usage of memref.alloc/dealloc ops in affine utils; use memory
effects instead. This is NFC for existing test cases, but strictly more
general/powerful in terms of functionality: it allows other allocation and
freeing ops (like gpu.alloc/dealloc) to work in conjunction with affine ops and
utilities.
Differential Revision: https://reviews.llvm.org/D132011
This commit adds support for 0-D vectors in ReductionOp.
Reviewed By: nicolasvasilache, dcaballe
Differential Revision: https://reviews.llvm.org/D131896
The X86SchedAlderlakeP.td file is automatically generated by schedtool
(D130897). Most of instruction's scheduling information is based on
measured ADL-P data in uops.info. Some data is from GLC tpt/lat data
provided by intel doc. The rest instruction's scheduling information is
from skylake client schedule model in order to get a relative complete
model.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D130959
We deferred the evaluation of dependent immediate invocations in https://reviews.llvm.org/D119375 until instantiation.
We should also not consider them referenced from a non-consteval context.
Fixes: https://github.com/llvm/llvm-project/issues/55601
```
template<typename T>
class Bar {
consteval static T x() { return 5; }
public:
Bar() : a(x()) {}
private:
int a;
};
Bar<int> g();
```
Is now accepted by clang. Previously it errored with: `cannot take address of consteval function 'x' outside of an immediate invocation Bar() : a(x()) {}`
Differential Revision: https://reviews.llvm.org/D132031
This commit adds the definitions for `dyld_chained_starts_in_image`,
`dyld_chained_starts_in_segment`, and related enums. Dumping their
contents is possible with the -chained_fixups flag of llvm-otool.
The chained-fixups.yaml test was changed to cover bindings/rebases, as
well as weak imports, weak symbols and flat namespace symbols. Now that
we have actual fixup entries, the __DATA segment contains data that
would need to be hexdumped in YAML. We also test empty pages (to look
for the "DYLD_CHAINED_PTR_START_NONE" annotation), so the YAML would end
up quite large. So instead, this commit includes a binary file.
When Apple's effort to upstream their chained fixups code continues,
we'll replace this code with the then-upstreamed code. But we need
something in the meantime for testing ld64.lld's chained fixups code.
Differential Revision: https://reviews.llvm.org/D131961
Add a proper TODO for the REDUCE instrinsic instead of crashing.
Reviewed By: PeteSteinfeld, vdonaldson
Differential Revision: https://reviews.llvm.org/D132020
This patch adds support for part of Zc extension which will be frozen soon.
This extension is designed to continue reducing the binary size of RISC-V programs.
In this patch:
`Zca` is a subset of C extension instructions that are compatible with the Zc extension.
The spec of Zc ext is [[ https://github.com/riscv/riscv-code-size-reduction/releases | Here ]]
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D130141
This implements support for using libc++ headers in MSVC toolchain.
We only support libc++ headers that are part of the toolchain, and
not headers installed elsewhere on the system.
Differential Revision: https://reviews.llvm.org/D101479
With reduced indentation, some strings can be reformatted to take less lines.
Also strategically apply `formatv` to shorten them.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D132088
Split out the body of a for-loop in `RewriteInstance::readRelocations` into a
separate function (`handleRelocation`). It's still over 300 lines of code,
so it's worth splitting down further.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D132078
NewGVN tables are not cleared out between the initial run of NewGVN and the verification. In case of phi-of-ops optimization, OpSafeForPHIOfOps goes out of sync between the two runs. One operand might not be safe for one basic block, but it might be safe for one of its successors. In this case, the operand will be added in OpSafeForPHIOfOps map. In verification phase, we reuse OpSafeForPHIOfOps without updating it again. As a result, the operand will be considered safe for phi-of-ops optimization even for the case that it is not. This patch fixes this problem.
Fix for 53807.
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D130910
Commit 9189a26664 caused llvm-jitlink to create bare JITDylibs to wrap real
dylibs loaded via -preload. This exposed a bug in MachOPlatform where we
assumed that all JITDylibs had been registered with the platform through
MachOPlatform::setupJITDylib (bare JITDylibs are _not_ run through this
function), and errored out where this was not the case.
This bug in MachOPlatform was causing test failures in compilert-rt:
Failed Tests (2):
ORC-x86_64-darwin :: TestCases/Darwin/x86-64/trivial-objc-methods.S
ORC-x86_64-darwin :: TestCases/Darwin/x86-64/trivial-swift-types-section.S
This commit fixes the issue by skipping JITDylibs that haven't been registered
with the platform via MachOPlatform::setupJITDylib.
This change omits default values from the sparse tensor type,
saving considerable text real estate for the common cases.
Reviewed By: Peiming
Differential Revision: https://reviews.llvm.org/D132083
Move logging into LLVM_DEBUG scope.
Remove redundant printing of jump table parents:
Old logging:
```
failed to analyze jump table in function _ZN12_GLOBAL__N_116InitHeaderSearch23Ad
dDefaultCIncludePathsERKN4llvm6TripleERKN5clang19HeaderSearchOptionsE/1(*2)
PIC Jump table JUMP_TABLE/_ZN12_GLOBAL__N_116InitHeaderSearch23AddDefaultCInclud
ePathsERKN4llvm6TripleERKN5clang19HeaderSearchOptionsE/1.1 for function _ZN12_GL
OBAL__N_116InitHeaderSearch23AddDefaultCIncludePathsERKN4llvm6TripleERKN5clang19
HeaderSearchOptionsE/1(*2) at 0x65996e0 with a total count of 0:
0x9dc
next jump table at 0x659a810 belongs to function _ZN5clang5Lexer40LexDependencyD
irectiveTokenWhileSkippingERNS_5TokenE
PIC Jump table JUMP_TABLE/_ZN5clang5Lexer40LexDependencyDirectiveTokenWhileSkipp
ingERNS_5TokenE.0 for function _ZN5clang5Lexer40LexDependencyDirectiveTokenWhile
SkippingERNS_5TokenE at 0x659a810 with a total count of 0:
jump table heuristic failure
```
New logging:
```
failed to analyze PIC Jump table JUMP_TABLE/_ZN12_GLOBAL__N_116InitHeaderSearch2
3AddDefaultCIncludePathsERKN4llvm6TripleERKN5clang19HeaderSearchOptionsE/1.1 for
function _ZN12_GLOBAL__N_116InitHeaderSearch23AddDefaultCIncludePathsERKN4llvm6T
ripleERKN5clang19HeaderSearchOptionsE/1(*2) at 0x65996e0 with a total count of 0:
absolute offset: 0x52ac58c
next PIC Jump table JUMP_TABLE/_ZN5clang5Lexer40LexDependencyDirectiveTokenWhile
SkippingERNS_5TokenE.0 for function _ZN5clang5Lexer40LexDependencyDirectiveToken
WhileSkippingERNS_5TokenE at 0x659a810 with a total count of 0:
jump table heuristic failure
```
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D131243
We were resetting DW_AT_low_pc to zero when DW_AT_high_pc was zero, or
DW_AT_low_pc == DW_AT_high_pc. This resulted in LLDB to print error "adding
range [0x0-0x0) which has a base that is less than the function's low PC".
Changed it so that when this case arises we set DW_AT_low_pc to the start
address.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D132059
Specifying `-regalloc=fast` is not reliable. With fast register allocation,
`LIS = getAnalysisIfAvailable<LiveIntervals>();` get nullptr
in "si-lower-sgpr-spills" pass, so the slot index is not created in the
pass for new inserted instructions. When verifying the machine
instructions, it fails on checking slot index. While greedy-ra is time
consuming basic-ra can be used to reduce compiling time for this test case.
Differential Revision: https://reviews.llvm.org/D131931