Move `__quoted_output_proxy` into the one file that uses it.
A `const char*` has no associated traits class, so `std::quoted("literal")`
should be printable into any basic_ostream regardless of traits.
Use hidden-friend `operator<<` and `operator>>`, since we're permitted to.
(The exact signature is unspecified because the class itself is unspecified.)
We shouldn't support `std::quoted("literal")` in C++03 or C++11 mode.
(We do need `std::__quoted(s)` and `std::__quoted(cs)` in C++11 mode,
because they're used by `std::__fs::filesystem::path`.)
Differential Revision: https://reviews.llvm.org/D120135
Fix a number of issues with MCSymbolizer::tryAddingSymbolicOperand()
in X86Disassembler:
* Pass instruction size instead of immediate size.
* Correctly adjust the value of PC-relative operands.
* Set operand offset to zero when the operand is specified
implicitly.
Reviewed By: Amir, skan
Differential Revision: https://reviews.llvm.org/D121065
* Use default ref capture for non-escaping lambdas (this makes
maintenance easier by allowing new uses, removing uses, having
conditional uses (such as in assertions) not require updates to an
explicit capture list)
* Simplify addPrivate API not to take a lambda, since it calls it
unconditionally/immediately anyway - most callers are simply passing
in a named value or short expression anyway and the lambda syntax just
adds noise/overhead
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D121077
The motivation is to enable the MSVC-style JMC instrumentation usable by a ELF-based
debugger. Since there is no prior experience implementing JMC feature for ELF-based
debugger, it might be better to just reuse existing MSVC-style JMC instrumentation.
For debuggers that support both ELF&COFF (like lldb), the JMC implementation might
be shared between ELF&COFF. If this is found to inadequate, it is pretty low-cost
switching to alternatives.
Implementation:
- The '-fjmc' is already a driver and cc1 flag. Wire it up for ELF in the driver.
- Refactor the JMC instrumentation pass a little bit.
- The ELF handling is different from MSVC in two places:
* the flag section name is ".just.my.code" instead of ".msvcjmc"
* the way default function is provided: MSVC uses /alternatename; ELF uses weak function.
Based on D118428.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D119910
These unit tests resides in an internal repository. Porting the tests to the
public repository.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D121021
The default lowering of vector transpose operations generates a large sequence of
scalar extract/insert operations, one pair for each scalar element in the input tensor.
In other words, the vector transpose is scalarized. However, there are transpose
patterns where one or more adjacent high-order dimensions are not transposed (for
example, in the transpose pattern [1, 0, 2, 3], dimensions 2 and 3 are not transposed).
This patch improves the lowering of those cases by not scalarizing them and extracting/
inserting a full n-D vector, where 'n' is the number of adjacent high-order dimensions
not being transposed. By doing so, we prevent the scalarization of the code and generate a
more performant vector version.
Paradoxically, this patch shouldn't improve the performance of transpose operations if
we are using LLVM. The LLVM pipeline is able to optimize away some of the extract/insert
operations and the SLP vectorizer is converting the scalar operations back to its vector
form. However, scalarizing a vector version of the code in MLIR and relying on the SLP
vectorizer to reconstruct the vector code again is highly undesirable for several reasons.
Reviewed By: nicolasvasilache, ThomasRaoux
Differential Revision: https://reviews.llvm.org/D120601
This is part of the implementation of the dataflow analysis framework.
See "[RFC] A dataflow analysis framework for Clang AST" on cfe-dev.
Reviewed-by: ymandel, xazax.hun
Differential Revision: https://reviews.llvm.org/D120984
... from a `uint64_t` to a `uint32_t`. (LLD-ELF uses a `uint32_t` too.)
About a 1.7% reduction in peak RSS when linking chromium_framework on my
3.2 GHz 16-Core Intel Xeon W Mac Pro, and no stat sig change in wall
time.
</Users/jezng/test2.sh ["before"]> </Users/jezng/test2.sh ["after"]> difference (95% CI)
RSS 1003036672.000 ± 9891065.259 985539505.231 ± 10272748.749 [ -2.3% .. -1.2%]
samples 27 26
base diff difference (95% CI)
sys_time 1.277 ± 0.023 1.277 ± 0.024 [ -0.9% .. +0.9%]
user_time 6.682 ± 0.046 6.598 ± 0.043 [ -1.6% .. -0.9%]
wall_time 5.904 ± 0.062 5.895 ± 0.063 [ -0.7% .. +0.4%]
samples 46 28
No appreciable change (~0.01%) in number of `equals` comparisons either:
Before:
ld64.lld: ICF needed 8 iterations
ld64.lld: equalsConstant() called 701643 times
ld64.lld: equalsVariable() called 3438526 times
After:
ld64.lld: ICF needed 8 iterations
ld64.lld: equalsConstant() called 701729 times
ld64.lld: equalsVariable() called 3438526 times
Reviewed By: #lld-macho, MaskRay, thakis
Differential Revision: https://reviews.llvm.org/D121052
The existing hashing of stubsHelperIndex has mostly been a no-op* for
some time now (ever since we made ICF run before dylib symbols get their
stubs indices assigned). I guess we could consider hashing the name +
filename of the DylibSymbol instead, but I'm not sure the overhead's
worth it... moreover, LLD/ELF only hashes their Defined symbols as well.
*: Technically it does change the hash value since stubsHelperIndex is
initialized to `UINT32_MAX` by default. But since all stubsHelperIndex
values are the same at when ICF runs, they don't add any useful
information to the hash.
This is debug code that is disabled by default. It'll provide a easy way
to figure out the impact (if any) of tweaking ICF's hashing algorithm
(since a poor quality hash will result in many more `equals*` calls).
Reviewed By: #lld-macho, oontvoo
Differential Revision: https://reviews.llvm.org/D121051
We can't just split by space, that's not going to give us the same
argv we'd have gotten from the shell, it could be in a string,
we must actually parse that as argv.
Materialize : i1 = extract_vector_elt t37, Constant:i64<0>
... into: "ptrue p, all" + PTEST
Test bit of lane 0 can use P register directly, and the instruction “pture all”
is loop invariant, which will beneficial to SVE after hoisting out the loop.
Reviewed By: david-arm, paulwalker-arm
Differential Revision: https://reviews.llvm.org/D120891
I'm a big fan of the autosuggestion feature but my terminal/color scheme
doesn't display faint any differently than regular lldb output, which
makes the feature a little confusing. This patch add a setting to change
the autosuggestion ANSI escape codes.
For example, to display the autosuggestion in italic, you can add this
to your ~/.lldbinit
settings set show-autosuggestion-ansi-prefix ${ansi.italic}
setting set show-autosuggestion-ansi-suffix ${ansi.normal}
Differential revision: https://reviews.llvm.org/D121064
When running llvm-bitcode-strip we want to remove the __LLVM
segment as well as the __bundle section when there are no other
sections in the segment.
Differential Revision: https://reviews.llvm.org/D120927
Zero-sized types are a GCC extension, also supported by Clang.
In theory it's already invalid to `delete` a void pointer or a
pointer-to-incomplete, so we shouldn't need any special code
to catch those cases; but in practice Clang accepts both
constructs with just a warning, and GCC even accepts `sizeof(void)`
with just a warning! So we must keep the static_asserts.
The hard errors are tested in `unique_ptr_dltr_dflt/*.compile.fail.cpp`.
In ranges::begin/end, check `sizeof >= 0` instead of `sizeof != 0`,
so as to permit zero-sized types while still disallowing incomplete
types.
Fixes#54100.
Differential Revision: https://reviews.llvm.org/D120633
This patch fixes the crash when printing some ops (like affine.for and
scf.for) when they are dumped in invalid state, e.g. during pattern
application. Now the AsmState constructor verifies the operation
first and switches to generic operation printing when the verification
fails. Also operations are now printed in generic form when emitting
diagnostics and the severity level is Error.
Reviewed By: rriddle, mehdi_amini
Differential Revision: https://reviews.llvm.org/D117834
Without passthru for now. Support for packed passthru requires
evl-into-mask folding.
Reviewed By: kaz7
Differential Revision: https://reviews.llvm.org/D120818
The analysis passes output function name encapsulated in `'` braces,
but LV uses `"`. Harmonizing this may help in creating an update script
for the LV costmodel test checks.
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D121105
This test is to make sure the return address registers, if clobbered in the
function or when the function has calls, are save/restored irrespective of
whether the IPRA is enabled/disabled.
This test is found to be not save/restore the return address registers, when
clobbered in the function, with the corresponding downstream changes of D114652.
The test could not be reduced further as the register allocator needs enough
register pressure so that it allocates the return address registers as well.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D120922
It is not necessary to wait for all outstanding memory operations before
barriers on hardware that can back off of the barrier in the event of an
exception when traps are enabled. Add a new subtarget feature which
tracks which HW has this ability.
Reviewed By: #amdgpu, rampitec
Differential Revision: https://reviews.llvm.org/D120544
It was included in HasV8_0rOps when D88660 first introduced that
architecture definition. In D118045 I moved it out of there and into
ProcessorFeatures.R82, so that -mcpu=cortex-r82 would continue to
behave the same as before but -march=armv8-r would include only the
mandatory parts of the architecture.
In fact, that was a mistake. Firstly, Cortex-R82 _doesn't_ implement
that feature, so it makes no sense to deliberately enable it for that
CPU in particular. But also, it's an extension that only adds system
registers, and we're generally more relaxed about where we enable
those (because kernel developers find it useful to write sysreg-access
instructions after runtime checking, and because sysreg accesses
aren't manufactured during code generation so the risk is small).
So, in line with that usual AArch64 policy, FeatureSpecRestrict ought
to be considered part of 8.0-R for LLVM purposes. So I'm moving it
back into HasV8_0rOps, where it started out.
Reviewed By: lenary
Differential Revision: https://reviews.llvm.org/D120830
Fixes https://github.com/llvm/llvm-project/issues/54147.
When handling `AllowShortFunctionsOnASingleLine`, we were searching for the last line with a smaller level than the current line. The search was incorrect when the first line had the same level as the current one. This led to an unsatisfied assumption about the existence of a brace (non-comment token).
Reviewed By: HazardyKnusperkeks, owenpan
Differential Revision: https://reviews.llvm.org/D120902
We no longer traverse the underlying RecordTypeLoc directly, but rather
visit the UsingTypeLoc.
Differential Revision: https://reviews.llvm.org/D121103
When decomposing constraints for unsigned conditions, we can use
negative values by zero-extending them, as long as they are less than
the maximum constraint value.
Fixes https://github.com/llvm/llvm-project/issues/54224