This change looks for cases where we can prove that an exit test of a loop can be performed in a narrower bitwidth, and that by doing so we can replace a loop-varying extend with a loop-invariant truncate.
The motivation here is that doing this unblocks the trip count analysis for narrow IVs involved in extended compare exit tests. It also has the nice side effect of simply making the code faster, even if we gain no other benefit from the improved analysis ability.
I've noted a few places this could be extended, but I think this stands reasonable on it's own as well.
Differential Revision: https://reviews.llvm.org/D112262
I'm not sure what the history is here but this test passes on macOS
today. It seems like we should unify these tests if they need to run
cross platform.
Reviewed By: #lld-macho, int3
Differential Revision: https://reviews.llvm.org/D113085
The main benefits of this change are faster access to operands
(no need to compute the offset, as it is now right after the
operation), simpler code(no need to manage a lot of the "is the
operand storage trailing" logic we had to before). The major
downside to this though, is that operand holding operations now
grow in size by 1 word (as no matter how we do this change, there
will need to be some additional book keeping).
Differential Revision: https://reviews.llvm.org/D111695
On our large iOS project this took a link from 1 minute 45 seconds to 45
seconds. For reference ld64 does the same link in ~20 seconds.
Reviewed By: #lld-macho, int3
Differential Revision: https://reviews.llvm.org/D113063
A quick grep for NDEBUG in MLIR revealed a use in DebugActions.h that breaks ABI. This patch changes the use of NDEBUG to LLVM_ENABLE_ABI_BREAKING_CHECKS which has the advantage of being independent of whether clients build their own app in debug or release as it is purely dependant on how MLIR itself was built.
Differential Revision: https://reviews.llvm.org/D113088
Read each pointer in the argv and envp arrays before dereferencing
it; this correctly marks an error when these pointers point into
memory that has been freed.
Differential Revision: https://reviews.llvm.org/D113046
This patch document llvm/utils/update_* python scripts that are used to generate
assertions in many of the LLVM regression test cases.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D112936
The check for whether a zero extension was needed was subtly wrong and
saw a value that was already 64 bits, so did not extend.
Fixes PR52357.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D112860
This casued miscompiles of switches, see comments on the code review.
> This extends `optimizeCompareInstr` to re-use previous comparison
> results if the previous comparison was with an immediate that was 1
> bigger or smaller. Example:
>
> CMP x, 13
> ...
> CMP x, 12 ; can be removed if we change the SETg
> SETg ... ; x > 12 changed to `SETge` (x >= 13) removing CMP
>
> Motivation: This often happens because SelectionDAG canonicalization
> tends to add/subtract 1 often when optimizing for fallthrough blocks.
> Example for `x > C` the fallthrough optimization switches true/false
> blocks with `!(x > C)` --> `x <= C` and canonicalization turns this into
> `x < C + 1`.
>
> Differential Revision: https://reviews.llvm.org/D110867
This reverts commit e2c7ee0743.
I don't really buy that masked interleaved memory loads/stores are supported on X86.
There is zero costmodel test coverage, no actual cost modelling for the generation
of the mask repetition, and basically only two LV tests.
Additionally, i'm not very interested in AVX512.
I don't know if this really helps "soft" block over at
https://reviews.llvm.org/D111460#inline-1075467,
but i think it can't make things worse at least.
When we are being told that there is a masking, instead of
completely giving up and falling back to
fully scalarizing `BasicTTIImplBase::getInterleavedMemoryOpCost()`,
let's correctly query the cost of masked memory ops,
keep all the pretty shuffle cost modelling,
but scalarize the cost computation for the mask replication.
I think, not scalarizing the shuffles themselves
may adjust the computed costs a bit,
and maybe hopefully just enough to hide the "regressions"
at https://reviews.llvm.org/D111460#inline-1075467
I do mean hide, because the test coverage is non-existent.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D112873
Windows member functions have __attribute__((thiscall)) on their type,
so any machine running this that is 32 bit windows fails this test, add
a wildcard, plus an additional run line to explain why.
As it can be seen in `InnerLoopVectorizer::vectorizeInterleaveGroup()`,
in some cases (reported by `UseMaskForGaps`), the gaps in the interleaved load/store group
will be masked away by another constant mask, so there is no need to
account for the cost of replication of the mask for these.
Differential Revision: https://reviews.llvm.org/D112877
Copied from llvm/test/Transforms/LoopVectorize/X86/x86-interleaved-accesses-masked-group.ll
As discussed in D111460 / D112877 / D112873 we have basically no test coverage
for this part of cost model.
The patch in D112349 added a previously nonexistant restriction on ifunc
resolvers that they MUST be defintions. However, the function
multiversioning depends on being able to resolve these resolvers at
link-time, so this additional restriction was breaking.
This reverts commit 5fbcf67734.
ProcessDebugger is used in ProcessWindows and NativeProcessWindows.
I thought I was simplifying things by renaming to DoGetMemoryRegionInfo
in ProcessDebugger but the Native process side expects "GetMemoryRegionInfo".
Follow the pattern that WriteMemory uses. So:
* ProcessWindows::DoGetMemoryRegioninfo calls ProcessDebugger::GetMemoryRegionInfo
* NativeProcessWindows::GetMemoryRegionInfo does the same
The recipe produces exactly one VPValue and can inherit directly from
it. This is in line with other recipes and avoids having to use
getVPSingleValue.
A reserved register:
- is not allocatable
- is considered always live
- is ignored by liveness tracking
NVPTX special registers match the criteria, and marking them as
reserved helps to avoid machine verifier error:
*** Bad machine code: Using an undefined physical register ***
- function: foo
- basic block: %bb.0 (0x557bb178b708)
- instruction: %0:int32regs = MOV_SPECIAL $envreg0
- operand 1: $envreg0
Differential Revision: https://reviews.llvm.org/D113008
Add a warning to TableGen for unused template arguments in classes and
multiclasses, for example:
multiclass Foo<int x> {
def bar;
}
$ llvm-tblgen foo.td
foo.td:1:20: warning: unused template argument: Foo::x
multiclass Foo<int x> {
^
A flag '--no-warn-on-unused-template-args' is added to disable the
warning. The warning is disabled for LLVM and sub-projects if
'LLVM_ENABLE_WARNINGS=OFF'.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D109359
TargetExternalSymbol is considered to be an immediate and not a
register, so machine verifier emits an error:
*** Bad machine code: Expected a register operand. ***
- function: static_offset
- basic block: %bb.0 bb (0x560e9b306028)
- instruction: %3:int64regs = MoveParamI64 &static_offset_param_1
- operand 1: &static_offset_param_1
The patch adds variants of this instruction with an immediate operand
for byval arguments on 64-bit and 32-bit targets.
Differential Revision: https://reviews.llvm.org/D113006
LLVM has the habit of turning adds with no common bits set into ors,
which means we need to detect them and treat them like adds again in the
MVE gather/scatter lowering pass.
Differential Revision: https://reviews.llvm.org/D112922
On AArch64 we have various things using the non address bits
of pointers. This means when you lookup their containing region
you won't find it if you don't remove them.
This changes Process GetMemoryRegionInfo to a non virtual method
that uses the current ABI plugin to remove those bits. Then it
calls DoGetMemoryRegionInfo.
That function does the actual work and is virtual to be overriden
by Process implementations.
A test case is added that runs on AArch64 Linux using the top
byte ignore feature.
Reviewed By: omjavaid
Differential Revision: https://reviews.llvm.org/D102757