This isn't perfect, since it doesn't use lazy_string - so if the
std::string does contain unprintable characters it might fail, but seems
better than nothing & LLVM doesn't generally store binary data in
std::strings.
Since a8b54834a1, there are two
distinct Windows path styles, `windows_backslash` (with the old
`windows` being an alias for it) and `windows_slash`.
4e4883e1f3 added helpers for
inspecting path styles.
The newly added windows_slash path style doesn't end up used in
LLDB yet anyway, as LLDB is quite decoupled from most of
llvm::sys::path and uses its own FileSpec class. To take it in
use, it could be hooked up in `FileSpec::Style::GetNativeStyle`
(in lldb/source/Utility/FileSpec.cpp) just like in the `real_style`
function in llvm/lib/Support/Path.cpp in
df0ba47c36.
It is not currently clear whether there's a real need for using
the Windows path style with forward slashes in LLDB (if there's any
other applications interacting with it, expecting that style), and
what other changes in LLDB are needed for that to work, but this
at least makes some of the checks more ready for the new style,
simplifying code a bit.
Differential Revision: https://reviews.llvm.org/D113255
This fixes lld/COFF/pdb-natvis.test (which only is run on Windows)
when using paths with forward slashes on Windows.
Differential Revision: https://reviews.llvm.org/D113265
These tests don't fail when only windows-dll is set in mingw mode, as the
bug is specific to MSVC mode.
Differential Revision: https://reviews.llvm.org/D112348
This is the 'or' sibling for the fold added with:
D113212
https://alive2.llvm.org/ce/z/tgnp7K
Note that neither of these transforms is poison-safe,
but it does not seem to matter at this level. We have
had the scalar version of D113212 for a long time, so
this is just making optimizer behavior consistent.
We do not have the scalar version of *this* fold,
however, so that is another follow-up.
Previously we didn't materialize conversions for arguments in certain
cases as the implicit type propagation was being heavily relied on
by many patterns. Now that those patterns have been fixed to
properly handle type conversions, we can drop the special behavior.
Differential Revision: https://reviews.llvm.org/D113233
This is to revert commit f95bd18b5f (Revert "[Attr] support
btf_type_tag attribute") plus a bug fix.
Previous change failed to handle cases like below:
$ cat reduced.c
void a(*);
void a() {}
$ clang -c reduced.c -O2 -g
In such cases, during clang IR generation, for function a(),
CGCodeGen has numParams = 1 for FunctionType. But for
FunctionTypeLoc we have FuncTypeLoc.NumParams = 0. By using
FunctionType.numParams as the bound to access FuncTypeLoc
params, a random crash is triggered. The bug fix is to
check against FuncTypeLoc.NumParams before accessing
FuncTypeLoc.getParam(Idx).
Differential Revision: https://reviews.llvm.org/D111199
Try simplify G_UADDO with 8 or 16 bit operands to wide G_ADD and TBNZ if
result is only used in the no-overflow case. It is restricted to cases
where we know that the high-bits of the operands are 0. If there's an
overflow, then the the 9th or 17th bit must be set, which can be checked
using TBNZ.
Reviewed By: paquette
Differential Revision: https://reviews.llvm.org/D111888
Correct the XLC/C++ version in the warning message to use the information from
XL's -qversion output.
Reviewed By: rzurob
Differential Revision: https://reviews.llvm.org/D112847
Fixes:
sanitizer_stacktrace.h:212:5: warning: ISO C++ forbids braced-groups within expressions [-Wpedantic]
Differential Revision: https://reviews.llvm.org/D113292
-f flags usually use the `=` form. -fdebug-compilation-dir= has been
around for a few months now (since 0c2bb6b446, both LLVM 12.0
and 13.0 have it), so using it shouldn't be a big problem -- especially
since use_relative_paths_in_debug_info is opt-in anyways.
Trying to update some options that don't at least have an inclusive language version.
This patch adds `objcmt-allowlist-dir-path` as a default alternative.
Reviewed By: akyrtzi
Differential Revision: https://reviews.llvm.org/D112591
These tests have pretty high O() complexity due to their nature,
which leads to potentially-long runtimes.
While in release build for me they took ~1 and ~2 sec,
as noted in https://reviews.llvm.org/D113214#inline-1080479
they take minutes in debug build.
Fine-tune the amount of permutations they deal with,
without affecting the test coverage. After this,
they take <~10ms each for me (in release build),
hopefully that is good-enough for debug build too.
Having a NoOpLoopNestPass can ensure that only outermost loop is invoked
for a LoopNestPass with a lit test.
There are some existing passes that are implemented as LoopNestPass, but
they are still using LOOP_PASS macro.
It would be easier to identify LoopNestPasses with a LOOPNEST_PASS
macro.
Differential Revision: https://reviews.llvm.org/D113185
This improves our type coverage. We were only testing integer
insert and extract before due to the FP types not being enabled for
arguments and returns.
Differential Revision: https://reviews.llvm.org/D113217
There is a combine in instcombine to transform a saturated add/sub into
a saddsat/ssubsat, currently handling inputs which are both sign
extended (https://alive2.llvm.org/ce/z/68qpTn). This can generalize to,
for example ashr of at least the bitwidth (https://alive2.llvm.org/ce/z/4TFyX-
and https://alive2.llvm.org/ce/z/qDWzFs for example). Which means it
generalizes further to "the number of sign bits", needing to be enough
to truncate to the size of the saturate. (An example using `or` for
instance: https://alive2.llvm.org/ce/z/EI_h_A).
So this patch makes use of ComputeNumSignBits (with the newly added
ComputeMinSignedBits) in matchSAddSubSat to generalize the fold to any
inputs with enough sign bits known, truncating the inputs to the new
size of the saturate.
Differential Revision: https://reviews.llvm.org/D112298
This patch add the conversion pattern for fir.extract_value
and fir.insert_value. fir.extract_value is lowered to llvm.extractvalue
anf fir.insert_value is lowered to llvm.insertvalue.
This patch also adds the type conversion for the BoxType and RecordType
needed to have some comprehensive tests.
This patch is part of the upstreaming effort from fir-dev branch.
Reviewed By: awarzynski
Differential Revision: https://reviews.llvm.org/D112961
Co-authored-by: Jean Perier <jperier@nvidia.com>
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
The __block Objective-C pointers can be set but not used due to a commonly used lifetime extension pattern in Objective-C.
Differential Revision: https://reviews.llvm.org/D112850
This is possible after D106314 / 8773822c57.
Makes the required prepare-code-coverage-artifact.py invocation a bit longer,
but that seems like a good tradeoff.
Differential Revision: https://reviews.llvm.org/D113282
This introduces a new ComputeMinSignedBits method for ValueTracking that
returns the BitWidth - SignBits + 1 from ComputeSignBits, and represents
the minimum bit size for the value as a signed integer. Similar to the
existing APInt::getMinSignedBits method, this can make some of the
reasoning around ComputeSignBits more natural.
See https://reviews.llvm.org/D112298
Another minor step towards merging FoldConstantVectorArithmetic into FoldConstantArithmetic.
We don't use SDNodeFlags in any constant folding inside DAG, so passing the Flags argument is a waste of time - an alternative would be to wire up FoldConstantArithmetic to take SDNodeFlags just-in-case we someday start using it, but we don't have any way to test it and I'd prefer to avoid dead code.
Differential Revision: https://reviews.llvm.org/D113276
Even though AVX512's masked mem ops (unlike AVX1/2) have a mask
that is a `VF x i1`, replication of said masks happens after
promotion of it to `VF x i8`, so we should use `i8`, not `i1`,
when calculating the cost of mask replication.
(X s< 0) ? Y : 0 --> (X s>> BW-1) & Y
We canonicalize to the icmp+select form in IR, and we already have this fold
for scalar select in SDAG, so I think it's an oversight that we don't have
the fold for vectors. It seems neutral for AArch64 and saves some instructions
on x86.
Whether we should also have the sibling folds for the inverse condition or
all-ones true value may depend on target-specific factors such as whether
there's an "and-not" instruction.
Differential Revision: https://reviews.llvm.org/D113212
Avid readers of this saga may recall from previous installments,
that replication mask replicates (lol) each of the `VF` elements
in a vector `ReplicationFactor` times. For example, the mask for
`ReplicationFactor=3` and `VF=4` is: `<0,0,0,1,1,1,2,2,2,3,3,3>`.
More importantly, replication mask is used by LoopVectorizer
when using masked interleaved memory operations.
As discussed in previous installments, while it is used by LV,
and we **seem** to support masked interleaved memory operations on X86,
it's support in cost model leaves a lot to be desired:
until basically yesterday even for AVX512 we had no cost model for it.
As it has been witnessed in the recent
AVX2 `X86TTIImpl::getInterleavedMemoryOpCost()`
costmodel patches, while it is hard-enough to query the cost
of a particular assembly sequence [from llvm-mca],
afterwards the check lines LV costmodel tests must be updated manually.
This is, at the very least, boring.
Okay, now we have decent costmodel coverage for interleaving shuffles,
but now basically the same mind-killing sequence has to be performed
for replication mask. I think we can improve at least the second half
of the problem, by teaching
the `TargetTransformInfoImplCRTPBase::getUserCost()` to recognize
`Instruction::ShuffleVector` that are repetition masks,
adding exhaustive test coverage
using `-cost-model -analyze` + `utils/update_analyze_test_checks.py`
This way we can have good exhaustive coverage for cost model,
and only basic coverage for the LV costmodel.
This patch adds precise undef-aware `isReplicationMask()`,
with exhaustive test coverage.
* `InstructionsTest.ShuffleMaskIsReplicationMask` shows that
it correctly detects all the known masks.
* `InstructionsTest.ShuffleMaskIsReplicationMask_undef`
shows that replacing some mask elements in a known replication mask
still allows us to recognize it as a replication mask.
Note, with enough undef elts, we may detect a different tuple.
* `InstructionsTest.ShuffleMaskIsReplicationMask_Exhaustive_Correctness`
shows that if we detected the replication mask with given params,
then if we actually generate a true replication mask with said params,
it matches element-wise ignoring undef mask elements.
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D113214
When created a UUNPKLO/HI node with an undef input then the
output should also be undef. I've added a target DAG combine
function to ensure we avoid creating an unnecessary uunpklo/hi
instruction.
Differential Revision: https://reviews.llvm.org/D113266
[NFC] This patch fixes URLs containing "master". Old URLs were either broken or
redirecting to the new URL.
Reviewed By: #libc, ldionne, mehdi_amini
Differential Revision: https://reviews.llvm.org/D113186
NumOps represents the number of elements for vector constant folding, rename this NumElts so in future we can the consistently use NumOps to represent the number of operands of the opcode.
Minor cleanup before trying to begin generalizing FoldConstantArithmetic to support opcodes other than binops.