They lead to -Wunused-command-line-argument and should be written as -Ttext=
instead, but the driver options end with a space. -Ttext=0 can be accepted by
the JoinedOrSeparate -T, so the JoinedOrSeparate -Ttext/etc are unneeded.
Based off the numbers from AMD SoG + Agner - vXi32 are both half-rate, and znver1 double-pumps the v8i32 op
We should have caught this earlier as many Intel models have half-rate pmulld already :-(
Fixes broken support for: `target.module[re.compile("libFoo")]`
There were two issues:
1. The type check was expecting `re.SRE_Pattern`
2. The expression to search the module path had a typo
In the first case, `re.SRE_Pattern` does not exist in Python 3, and is replaced
with `re.Pattern`.
While editing this code, I changed the type checks to us `isinstance`, which is
the conventional way of type checking.
From the docs on `type()`:
> The `isinstance()` built-in function is recommended for testing the type of an object, because it takes subclasses into account.
Differential Revision: https://reviews.llvm.org/D133130
This was achieved with an updated version of the 'cost-tables vs llvm-mca' script D103695
As we're using 'typical' worst case values, not all cost entries come from a single CPU - e.g. the latency/throughput from haswell but the size-latency(uops) from zen1/alderlake-e due to 'double pumping'
As the uop count (used for TCK_SizeAndLatency) for divss/divps is typically so low, we need to override isExpensiveToSpeculativelyExecute to ensure we keep fdiv calls behind branches - although for some very recent cpu targets it might not be necessary any more and could be relaxed.
Replace the following config attributes with `mlir_lib_dir`:
- `mlir_runner_utils_dir`
- `linalg_test_lib_dir`
- `spirv_wrapper_library_dir`
- `vulkan_wrapper_library_dir`
- `mlir_integration_test_dir`
I'm going to clean up substitutions in separate changes.
Reviewed By: aartbik, mehdi_amini
Differential Revision: https://reviews.llvm.org/D133217
These were affected by D131825 (and reported on Issue #57149) - adding the verification will help ensure that we don't hit this again on builds with EXPENSIVE_CHECKS enabled
CGP uses a raw `getInstructionCost(I, TargetTransformInfo::TCK_SizeAndLatency) >= TCC_Expensive` check to see if its better to move an expensive instruction used in a select behind a branch instead.
This is causing issues with upcoming improvements to TCK_SizeAndLatency costs on X86 as we need to use TCK_SizeAndLatency as an uop count (so its compatible with various target-specific buffer sizes - see D132288), but we can have instructions that have a low TCK_SizeAndLatency value but should still be treated as 'expensive' (FDIV for example) - by adding a isExpensiveToSpeculativelyExecute wrapper we can keep the current behaviour but still add an x86 override in a future patch when the cost tables are updated to compensate.
Matches numbers from AMD SoG + Agner - should always be on FPU Pipes 0+1, no additional uops for folded instructions and znver1 double pumps 256-bit vectors and is always latency = 4cy for f64 multiplies
Noticed while adding fmul CostKinds support to the x86 cost models in rG0735200e3f50 and znver1 wasn't being flagged as requiring 2uop for 256-bit vectors
This was achieved with an updated version of the 'cost-tables vs llvm-mca' script D103695
As we're using 'typical' worst case values, not all cost entries come from a single CPU - e.g. the latency/throughput from haswell but the size-latency(uops) from zen1/alderlake-e due to 'double pumping'
Current implementation of registerModuleReference() function not only
"registers" module reference, but also clones referenced module
(inside loadClangModule()). That may lead to cloning the module with
incorrect options (registerModuleReference() examines module references
and additionally accumulates MaxDwarfVersion and accel tables info).
Since accumulated options may differ from the current values,
it is incorrect to clone modules before options are fully accumulated.
This patch separates "cloning" code from "registering" code. So,
that accumulating option is done in the "registering stage" and
"cloning" is done after all modules are registered and options accumulated.
It also adds a callback for loaded compile units which can be used for
D132755 and D132371(to allow doing options accumulation outside
of DWARFLinker).
Differential Revision: https://reviews.llvm.org/D133047
When the opening brace of a control statement block is wrapped, we
must check the previous line to determine whether to try to merge
the block.
Fixes#38639.
Fixes#48007.
Fixes#57421.
Differential Revision: https://reviews.llvm.org/D133093
We want to move functionality from the LLVM ORCTargetProcess library into the
ORC runtime, and this will mean implementing remote-executor testing tools
(like llvm-jitlink-executor and lli-child-target) in the ORC runtime.
This patch refactors the ORC runtime build system to introduce an
add_orc_tool function that can be used to add new test tools. The code is
modeled on existing functions for adding unit tests.
A placeholder orc-rt-executor tool and test are added to verify that the
config changes behave as expected.
Reviewed By: phosek
Differential Revision: https://reviews.llvm.org/D133084
One of the test cases in that test is designed to test the compiling
jobs with a linking stage, but the PS4/PS5/Hexagon platform requires
an external linker that isn't present.
So this test do not support the "PS4/PS5/Hexagon".
-fmodules-local-submodule-visibility and -fdelayed-template-parsing
don't work properly together because the template is parsed in the
visibility context of the wrong module.
I discovered this additional bug at the end of working on D132906
In Sema::CheckCompletedCXXClass(...) uses a lambda CheckForDefaultedFunction to
verify each CXXMethodDecl holds to the expected invariants before passing them
on to CheckForDefaultedFunction.
It is currently missing a check that it is not deleted, this adds that check and
a test that crashed without this check.
This fixes: https://github.com/llvm/llvm-project/issues/57516
Differential Revision: https://reviews.llvm.org/D133177
1. This implementation change the default storing behavior of -ftime-trace only.
That is, if the compiling job contains the linking action, the executable file' s directory may be seem as the main work directory.
Thus the time trace files would be stored in the same directory of linking result.
By this approach, the user can easily get the time-trace files in the main work directory. The improved demo results:
```
$ clang++ -ftime-trace -o main.out /demo/main.cpp
$ ls .
main.out main-[random-string].json
```
2. In addition, the main codes of time-trace files' path inference have been refactored.
* The <path> of -ftime-trace=<path> is infered in clang driver
* After that, -ftime-trace=<path> can be added into clang's options
By this approach, the dirty work of path processing and judging can be implemented in driver layer, so that the clang may focus on its main work.
# $ clang -ftime-trace -o xxx.out xxx.cpp
Differential Revision: https://reviews.llvm.org/D131469
This commit adds support for interacting with a (valid) bytecode file in the same
way as .mlir. This allows editing, using all of the traditional LSP features, etc. but
still using bytecode as the on-disk serialization format. Loading a bytecode file this
way will fail if the bytecode is invalid, and saving will fail if the edited .mlir is invalid.
Differential Revision: https://reviews.llvm.org/D132970