Summary:
This patch improves the implementation of D100774 by replacing the global
variable it introduced with a function that returns a reference to an internal
one. This removes the need to define the variable in every plugin that uses it.
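As a rough illustration of the pattern (not the code from this patch), the
sketch below puts a function-local static behind an accessor. `PluginInfo`,
`getPluginInfo`, `pluginA` and `pluginB` are hypothetical names chosen for
the example:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the shared state; the real type is not shown here.
struct PluginInfo {
  std::vector<std::string> Messages;
};

// Before: a header declares `extern PluginInfo Info;` and each plugin that
// uses it must also supply a definition.
// After: the accessor owns the single definition.
inline PluginInfo &getPluginInfo() {
  static PluginInfo Info; // constructed once, on first use
  return Info;
}

// Hypothetical plugins: they use the shared state without defining it.
inline void pluginA() { getPluginInfo().Messages.push_back("A"); }
inline void pluginB() { getPluginInfo().Messages.push_back("B"); }

// Usage: both plugins observe the same object.
inline void accessorExample() {
  pluginA();
  pluginB();
  assert(&getPluginInfo() == &getPluginInfo());
  assert(getPluginInfo().Messages.size() >= 2);
}
```

The function-local static gives one definition for all users, so nothing
needs to be repeated per plugin.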
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D101102
D101114 enforced proper version checks, which exposed a variety of version
mismatch issues in our tests. We previously changed the test inputs to
target 10.0, which was the simpler thing to do, but we should really
just have our lit.local.cfg default to targeting 10.15, which is what is done
here. We're not likely to ever have proper support for the older versions
anyway, as that would require more work for unclear benefit; for instance,
llvm-mc seems to generate a different compact unwind format for older macOS
versions, which would cause our compact-unwind.s test to fail.
Targeting 10.15 by default causes the following behavioral changes:
* `__mh_execute_header` is now a section symbol instead of an absolute symbol
* LC_BUILD_VERSION gets emitted instead of LC_VERSION_MIN_MACOSX. The former is
32 bytes in size whereas the latter is 16 bytes, so a bunch of hardcoded
address offsets in our tests had to be updated.
* Executables targeting macOS 10.6 or later are PIE by default
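The size difference between the two load commands can be sketched as follows.
The structs repeat the field layout from `<mach-o/loader.h>`; the 32-byte
figure assumes that exactly one `build_tool_version` entry follows the
command, and `shiftedAddress` is a hypothetical helper that ignores any
alignment applied afterwards:

```cpp
#include <cassert>
#include <cstdint>

namespace macho_sketch {
// Field layout as declared in <mach-o/loader.h>, repeated here so the
// sketch builds on any host.
struct version_min_command {
  uint32_t cmd;
  uint32_t cmdsize;
  uint32_t version;
  uint32_t sdk;
};

struct build_version_command {
  uint32_t cmd;
  uint32_t cmdsize;
  uint32_t platform;
  uint32_t minos;
  uint32_t sdk;
  uint32_t ntools;
};

struct build_tool_version {
  uint32_t tool;
  uint32_t version;
};

// Assumption: exactly one tool entry follows the build version command.
constexpr uint64_t BuildVersionSize =
    sizeof(build_version_command) + sizeof(build_tool_version);
constexpr uint64_t VersionMinSize = sizeof(version_min_command);

static_assert(VersionMinSize == 16, "LC_VERSION_MIN_MACOSX size");
static_assert(BuildVersionSize == 32, "LC_BUILD_VERSION size with one tool");

// Hypothetical helper: where an address that follows the load commands ends
// up once the larger command is emitted (before any alignment).
constexpr uint64_t shiftedAddress(uint64_t OldAddress) {
  return OldAddress + (BuildVersionSize - VersionMinSize);
}

inline void loadCommandExample() {
  assert(shiftedAddress(0x1000) == 0x1010);
}
} // namespace macho_sketch
```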
Note that this diff was stacked atop a local revert of most of the test
changes in rG8c17a875150f8e736e8f9061ddf084397f45f4c5, to make review easier.
Reviewed By: #lld-macho, oontvoo
Differential Revision: https://reviews.llvm.org/D101119
A missing isInvalid() check led to an attempt to
instantiate a template with an empty instantiation stack.
Differential Revision: https://reviews.llvm.org/D100675
This patch adds support for both scalable- and fixed-length vector code
lowering of the llvm.minnum and llvm.maxnum intrinsics to the equivalent
RVV instructions.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D101035
Add optional keyword argument 'on_line' to DexLabel to label the specified line
instead of the line the command is found on.
This will be helpful when used alongside DexDeclareFile (D99651).
Reviewed By: TWeaver
Differential Revision: https://reviews.llvm.org/D101055
This recommits 4f5da356ff, including
explicit implementations of a move constructor and deleted copy
constructors/assignment operators, to fix failures with some compilers.
This reverts the revert 74854d00e8.
Previous build failures were caused by an error in bitcode reading and
writing for DIArgList metadata, which has been fixed in e5d844b587.
There were also some unnecessary asserts that were being triggered on
certain builds, which have been removed.
This reverts commit dad5caa59e.
The previous D101039 didn't fix the SmallSet insertion issue, because the
comparison always returned false for two different non-null BBs.
This patch makes the comparison complete by comparing `MBB` first, so
that a single operator always yields a consistent order.
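A minimal sketch of the difference, using `std::set` as a stand-in for
SmallSet. `Block`, `Ref`, `incompleteLess` and `CompleteLess` are
hypothetical names for illustration and are not taken from the patch:

```cpp
#include <cassert>
#include <functional>
#include <set>

struct Block {}; // hypothetical stand-in for a machine basic block

struct Ref { // hypothetical (block, position) pair stored in the set
  const Block *MBB;
  unsigned Pos;
};

// Incomplete: refs in different blocks never compare less in either
// direction, so an ordered set treats them as duplicates.
inline bool incompleteLess(const Ref &A, const Ref &B) {
  return A.MBB == B.MBB && A.Pos < B.Pos;
}

// Complete: order by block first, then by position within the block.
struct CompleteLess {
  bool operator()(const Ref &A, const Ref &B) const {
    if (A.MBB != B.MBB)
      return std::less<const Block *>()(A.MBB, B.MBB);
    return A.Pos < B.Pos;
  }
};

inline void orderingExample() {
  Block B1, B2;
  Ref X{&B1, 0}, Y{&B2, 0};
  // Neither is "less" under the incomplete rule: treated as equivalent.
  assert(!incompleteLess(X, Y) && !incompleteLess(Y, X));
  // The complete rule orders them one way or the other.
  assert(CompleteLess()(X, Y) != CompleteLess()(Y, X));
  std::set<Ref, CompleteLess> Refs;
  Refs.insert(X);
  Refs.insert(Y);
  assert(Refs.size() == 2); // both refs are kept
}
```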
Mask vectors are handled similarly to data vectors in N-D TransferWriteOp. They are copied into a temporary memory buffer, which can be indexed into with non-constant values.
Differential Revision: https://reviews.llvm.org/D101136
LLVM should be smarter about the alignment of *known* mallocs, and this knowledge may enable other optimizations.
This originally started as an LLVM patch (https://reviews.llvm.org/D100862), but this logic really belongs in Clang.
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D100879
This commit adds support for broadcast dimensions in permutation maps of vector transfer ops.
Also fixes a bug in VectorToSCF that generated incorrect in-bounds checks for broadcast dimensions.
Differential Revision: https://reviews.llvm.org/D101019
If we are using a simplified value, we need to add an extra
dependency on this value, because changes to the class of the
simplified value may require us to invalidate any decision based on
that value.
This is done by adding such values as additional users; however, the
current code does not exclude temporary instructions.
At the moment, this means that we miss those dependencies for
phi-of-ops, because they are temporary instructions at this point. We
instead need to add the extra dependencies to the root instruction of
the phi-of-ops.
This patch pushes the responsibility of adding extra users to the
callers of createExpression & performSymbolicEvaluation. At those
points, it is clearer which real instruction to pick.
Alternatively, we could either pass the 'real' instruction as an additional
argument or use another map, but I think the approach in the patch makes
things a bit easier to follow.
Fixes PR35074.
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D99987
This commit adds support for dimension permutations in permutation maps of vector transfer ops.
Differential Revision: https://reviews.llvm.org/D101007
Fix style/clang-tidy warnings, trim stale includes and forward
declarations, and clean up/fix stale comments.
Differential Revision: https://reviews.llvm.org/D101021
Strided 1D vector transfer ops are 1D transfers operating on a memref dimension different from the last one. Such transfer ops do not access contiguous memory blocks (vectors), but access memory in a strided fashion. In the absence of a mask, strided 1D vector transfer ops can also be lowered using matrix.column.major.* LLVM instructions (in a later commit).
Subsequent commits will extend the pass to handle the remaining missing permutation maps (broadcasts, transposes, etc.).
Differential Revision: https://reviews.llvm.org/D100946
Make the following functions return void:
addLabel()
addSectionLabel()
addSectionDelta()
This aligns with the other attribute-adding functions.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D101022
ConstantFoldingMIRBuilder was an experiment which is not used for
anything. The constant folding functionality is now part of
CSEMIRBuilder.
Differential Revision: https://reviews.llvm.org/D101050
EMITBKEY is emitted for PAC-RET+bkey, and it is not a machine instruction.
PR: 49957
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D100996
When building the preamble, clangd truncates file contents. This yielded
erroneous warnings in some cases.
This patch fixes the issue by turning off no-newline-at-eof warnings whenever
the file has more contents than the preamble.
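The condition can be sketched roughly as below. `truncateToPreamble` and
`suppressNoNewlineAtEOF` are hypothetical helpers written for this
illustration and are not clangd APIs:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: the preamble buffer is the first PreambleSize bytes
// of the file.
inline std::string truncateToPreamble(const std::string &Contents,
                                      std::size_t PreambleSize) {
  return Contents.substr(0, PreambleSize);
}

// Hypothetical predicate: when the file extends past the preamble, a missing
// trailing newline in the truncated buffer says nothing about the real file.
inline bool suppressNoNewlineAtEOF(std::size_t PreambleSize,
                                   std::size_t FileSize) {
  return FileSize > PreambleSize;
}

inline void preambleExample() {
  const std::string File = "#include <vector>\nint main() {}\n";
  const std::size_t PreambleSize = 17; // covers `#include <vector>` only
  const std::string Preamble = truncateToPreamble(File, PreambleSize);
  assert(Preamble.back() != '\n'); // the truncated buffer lacks the newline
  assert(File.back() == '\n');     // the real file ends with one
  assert(suppressNoNewlineAtEOF(PreambleSize, File.size()));
  assert(!suppressNoNewlineAtEOF(File.size(), File.size()));
}
```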
Fixes https://github.com/clangd/clangd/issues/744.
Differential Revision: https://reviews.llvm.org/D100501
Fixes PR47627
This fix suppresses rerolling a loop which has an unrerollable
instruction.
Sample IR for the explanation below:
```
define void @foo([2 x i32]* nocapture %a) {
entry:
br label %loop
loop:
; base instruction
%indvar = phi i64 [ 0, %entry ], [ %indvar.next, %loop ]
; unrerollable instructions
%stptrx = getelementptr inbounds [2 x i32], [2 x i32]* %a, i64 %indvar, i64 0
store i32 999, i32* %stptrx, align 4
; extra simple arithmetic operations, used by root instructions
%plus20 = add nuw nsw i64 %indvar, 20
%plus10 = add nuw nsw i64 %indvar, 10
; root instruction 0
%ldptr0 = getelementptr inbounds [2 x i32], [2 x i32]* %a, i64 %plus20, i64 0
%value0 = load i32, i32* %ldptr0, align 4
%stptr0 = getelementptr inbounds [2 x i32], [2 x i32]* %a, i64 %plus10, i64 0
store i32 %value0, i32* %stptr0, align 4
; root instruction 1
%ldptr1 = getelementptr inbounds [2 x i32], [2 x i32]* %a, i64 %plus20, i64 1
%value1 = load i32, i32* %ldptr1, align 4
%stptr1 = getelementptr inbounds [2 x i32], [2 x i32]* %a, i64 %plus10, i64 1
store i32 %value1, i32* %stptr1, align 4
; loop-increment and latch
%indvar.next = add nuw nsw i64 %indvar, 1
%exitcond = icmp eq i64 %indvar.next, 5
br i1 %exitcond, label %exit, label %loop
exit:
ret void
}
```
In the loop rerolling pass, `%indvar` and `%indvar.next` are appended
to the `LoopIncs` vector in the `LoopReroll::DAGRootTracker::findRoots`
function.
Before this fix, the two instructions under the `unrerollable instructions`
comment were marked as `IL_All` at the end of the
`LoopReroll::DAGRootTracker::collectUsedInstructions` function,
as were the instructions under the `extra simple arithmetic operations`
and `loop-increment and latch` comments. This is incorrect
because `IL_All` means that the instruction should be executed in all
iterations of the rerolled loop, but the `store` instruction should
not.
This fix rejects instructions which may have side effects and don't
belong to def-use chains of any root instructions and reductions.
See https://bugs.llvm.org/show_bug.cgi?id=47627 for more information.
The previous condition in the assert was overly strict. We ought to allow
the same immediate value to be loaded more than once. The intent of
the assert is to catch the same AMX register using multiple different
immediate shapes, so this fix is supposed to be NFC.
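A minimal sketch of the relaxed condition, assuming a simple map from
register to immediate. `ShapeImmTracker` and `record` are hypothetical names
and do not reflect the data structures used in the patch:

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical tracker: remembers the immediate shape value seen per register.
class ShapeImmTracker {
  std::map<unsigned, int64_t> Imms;

public:
  // Returns false only when Reg was already seen with a *different*
  // immediate. Seeing the same immediate again is accepted.
  bool record(unsigned Reg, int64_t Imm) {
    auto Result = Imms.emplace(Reg, Imm);
    return Result.second || Result.first->second == Imm;
  }
};

inline void shapeExample() {
  ShapeImmTracker Tracker;
  const bool First = Tracker.record(/*Reg=*/1, /*Imm=*/16);
  const bool Repeat = Tracker.record(1, 16);   // same value again
  const bool Conflict = Tracker.record(1, 64); // different value
  assert(First && Repeat && !Conflict);
  (void)First; (void)Repeat; (void)Conflict;
}
```

Only the last call describes the situation the assert is meant to catch.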
Reviewed By: LuoYuanke
Differential Revision: https://reviews.llvm.org/D101124
We require that AMX instructions and the defs of their shapes do not
intersect when we insert ldtilecfg. However, this does not always hold,
either because users don't follow the AMX API model or because of optimizations.
This patch adds a mechanism that tries to hoist the defs of AMX shapes as well.
It only hoists shapes inside a BB; we can improve it for cases across
BBs in the future. Currently, it only hoists shapes whose sources are all
defined above the first AMX instruction. We can improve this for the case
where the only source below the AMX instruction moves an immediate value to a register.
Differential Revision: https://reviews.llvm.org/D101067
Add __uintr_frame structure and use UIRET instruction for functions with
x86 interrupt calling convention when UINTR is present.
Reviewed By: LuoYuanke
Differential Revision: https://reviews.llvm.org/D99708
This is mostly NFC, except that for the end of a BB the previous slot is no longer used.
Idx is used to find a def of the sibling live interval in that slot.
The def at the end of the MBB and at the previous slot of the MBB end should
be the same, so it should be NFC.
Reviewers: reames, qcolombet, MatzeB, wmi, rnk
Reviewed By: rnk
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D100922
I first had a more invasive patch for this (D101069), but while trying
to get that polished for review I realized that lld's current symbol
merging semantics mean that only a very small code change is needed.
So this goes with the smaller patch for now.
This has no effect on projects that build with -fvisibility=hidden
(e.g. chromium), since these see .private_extern symbols instead.
It does have an effect on projects that build with -fvisibility-inlines-hidden
(e.g. llvm) in -O2 builds, where LLVM's GlobalOpt pass will promote most inline
functions from .weak_definition to .weak_def_can_be_hidden.
Before this patch:
% ls -l out/gn/bin/clang out/gn/lib/libclang.dylib
-rwxr-xr-x 1 thakis staff 113059936 Apr 22 11:51 out/gn/bin/clang
-rwxr-xr-x 1 thakis staff 86370064 Apr 22 11:51 out/gn/lib/libclang.dylib
% out/gn/bin/llvm-objdump --macho --weak-bind out/gn/bin/clang | wc -l
8291
% out/gn/bin/llvm-objdump --macho --weak-bind out/gn/lib/libclang.dylib | wc -l
5698
With this patch:
% ls -l out/gn/bin/clang out/gn/lib/libclang.dylib
-rwxr-xr-x 1 thakis staff 111721096 Apr 22 11:55 out/gn/bin/clang
-rwxr-xr-x 1 thakis staff 85291208 Apr 22 11:55 out/gn/lib/libclang.dylib
% out/gn/bin/llvm-objdump --macho --weak-bind out/gn/bin/clang | wc -l
725
% out/gn/bin/llvm-objdump --macho --weak-bind out/gn/lib/libclang.dylib | wc -l
542
Linking clang becomes a tiny bit faster with this patch:
x 100 0.67263818 0.77847815 0.69430709 0.69877208 0.017715892
+ 100 0.67209601 0.73323393 0.68600798 0.68917346 0.012824377
Difference at 95.0% confidence
-0.00959861 +/- 0.00428661
-1.37364% +/- 0.613449%
(Student's t, pooled s = 0.0154648)
This only happens if lld with the patch and lld without the patch are both
linked with an lld with the patch or both linked with an lld without the patch
(...or with ld64). I accidentally linked the lld with the patch with an lld
without the patch and the other way round at first. In that setup, no
difference is found. That makes sense, since having fewer weak imports will
make the linked output a bit faster too. So not only does this make linking
binaries such as clang a bit faster (since fewer exports need to be written to
the export trie by lld), the linked output binary is also a bit faster
(since dyld needs to process fewer dynamic imports).
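The summary figures in the timing output above can be reproduced from the two
sample rows. This sketch assumes the columns are N, min, max, median, mean
and standard deviation, and that the 95% interval uses a critical value of
1.96; `Sample` and the helper names are hypothetical:

```cpp
#include <cassert>
#include <cmath>

struct Sample {
  int N;
  double Mean;
  double Stddev;
};

inline double meanDifference(const Sample &Base, const Sample &Patched) {
  return Patched.Mean - Base.Mean;
}

inline double relativeDifferencePercent(const Sample &Base,
                                        const Sample &Patched) {
  return 100.0 * meanDifference(Base, Patched) / Base.Mean;
}

// Pooled standard deviation of two independent samples.
inline double pooledStddev(const Sample &A, const Sample &B) {
  const double SumSq = (A.N - 1) * A.Stddev * A.Stddev +
                       (B.N - 1) * B.Stddev * B.Stddev;
  return std::sqrt(SumSq / (A.N + B.N - 2));
}

// Half-width of the confidence interval for the difference of means.
// Assumption: TCritical = 1.96 for the 95% level at this sample size.
inline double confidenceHalfWidth(const Sample &A, const Sample &B,
                                  double TCritical = 1.96) {
  return TCritical * pooledStddev(A, B) * std::sqrt(1.0 / A.N + 1.0 / B.N);
}

inline void timingExample() {
  const Sample Base = {100, 0.69877208, 0.017715892};    // `x` row
  const Sample Patched = {100, 0.68917346, 0.012824377}; // `+` row
  assert(std::fabs(meanDifference(Base, Patched) + 0.00959861) < 1e-7);
  assert(std::fabs(relativeDifferencePercent(Base, Patched) + 1.37364) < 1e-4);
  assert(std::fabs(pooledStddev(Base, Patched) - 0.0154648) < 1e-6);
  assert(std::fabs(confidenceHalfWidth(Base, Patched) - 0.00428661) < 1e-6);
}
```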
This also happens to fix the one `check-clang` failure when using lld as host
linker, but mostly for silly reasons: See crbug.com/1183336, mostly comment 26.
The real bug here is that c-index-test links all of LLVM both statically and
dynamically, which is an ODR violation. Things just happen to work with this
patch.
So after this patch, check-clang, check-lld, check-llvm all pass with lld as
host linker :)
Differential Revision: https://reviews.llvm.org/D101080
- __got is in --bind output, so print that too (makes the test
a bit stronger)
- WEAK_DEFINES, BINDS_TO_WEAK are in the mach-o header, so
--private-header is enough, no need for --all-headers
(makes the test a bit easier to work with when it fails)
Differential Revision: https://reviews.llvm.org/D101065