Remove duplicate `Pass` suffix from view-op-graph pass class name. The
extra suffix would lead to methods like registerViewOpGraphPassPass
being generated.
Differential Revision: https://reviews.llvm.org/D114459
Also replace ArrayAttr with IndexElementsAttr to model subscript dimensions.
An array of attribute is a sparse inefficient storage, with an API that
requires to unpack/repack integers at every call site.
Instead we can store dense array of integer as IndexElementsAttr.
Reviewed By: clementval, kiranchandramohan
Differential Revision: https://reviews.llvm.org/D112899
I don't see a reason why not to. If we allows lookup functions by full names,
I can change the test case in D113930 to use `lldb-test symbols --find=function --name=full::name --function-flags=full ...`,
though the duplicate method decl prolem is still there for `lldb-test symbols --dump-ast`.
That's a seprate bug, we can fix it later.
Differential Revision: https://reviews.llvm.org/D114467
Nothing uses more than 8bit now. So the rest of the headers can store other data.
kStackTraceMax is 256 now, but all sanitizers by default store just 20-30 frames here.
Verified no diff exist between previous version, new version python 2, and python 3 for an example stack.
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D114404
Similarly to what GCC does, we should allow scalars with
the "v" constraint rather than introducing unnecessary
new constraints for scalars in Altivec registers.
Differential revision: https://reviews.llvm.org/D113635
AMDGPU is unusual in that the both stack is indexed in the same
direction as stack growth (up). We therefore always need the emergency
stack slots placed as low as possible to ensure they are in range of
load/store instruction immediate offsets. The existing logic is mostly
OK, but failed if we required stack realignment.
I don't understand what the existing control isFPCloseToIncomingSP is
supposed to mean, but can only be used to stop placing the scavenge
slots earlier. Make this explicit so that targets can opt-in rather
than opt-out only.
This diff is adding the capping_size determination for the list and forward list, to limit the number of children to be displayed. Also it modifies and unifies tests for libcxx and libstdcpp list data formatter.
Reviewed By: wallace
Differential Revision: https://reviews.llvm.org/D114433
This diff is avoiding the size limitation introduced by the capping size for the libcxx and libcpp bitset data formatters.
Reviewed By: wallace
Differential Revision: https://reviews.llvm.org/D114461
We need to add checks that ensure that some core variables are valid, so
that we avoid printing out garbage data. The worst that could happen is
that an non-initialized variable is being printed as something with
123123432 children instead of 0.
Differential Revision: https://reviews.llvm.org/D114458
As suggested by @labath in https://reviews.llvm.org/D114403, we should
make the formatter more resilient to corrupted data. The Libcxx version
explicitly checks for engaged = 1, so we can do that as well for safety.
Differential Revision: https://reviews.llvm.org/D114450
(a & b) ^ (~a | b) --> ~a
I was looking for a shortcut to reduce some of the complex logic
folds that are currently up for review (D113216
and others in that stack), and I found this missing from
instcombine/instsimplify.
There is a trade-off in putting it into instsimplify: because
we can't create new values here, we need a strict 'not' op (no
undef elements). Otherwise, the fold is not valid:
https://alive2.llvm.org/ce/z/k_AGGj
If this was in instcombine instead, we could create the proper
'not'. But having the fold here benefits other passes like GVN
that use instsimplify as an analysis.
There is a related fold where 'and' and 'or' are swapped, and
that is planned as a follow-up commit.
Differential Revision: https://reviews.llvm.org/D114462
The MIR sample loader changes the branch probability but not BFI.
Here we force a recompute of BFI if the branch probabilities are
changed.
Also register the MIR FSAFDO passes properly.
Differential Revision: https://reviews.llvm.org/D114400
Configuring lldb with `LLDB_ENABLE_PYTHON=OFF` and `LLDB_ENABLE_LUA=ON` results in a CMake error:
CMake Error at lldb/bindings/lua/CMakeLists.txt:47 (create_relative_symlink):
Unknown CMake command "create_relative_symlink".
Call Stack (most recent call first):
lldb/CMakeLists.txt:117 (finish_swig_lua)
This is because the CMake function `create_relative_symlink` only exists in `lldb/bindings/python/CMakeLists.txt`, and not in `lldb/bindings/lua/CMakeLists.txt`.
Move the function to `lldb/bindings/CMakeLists.txt`, so it is available for all language bindings.
Reviewed By: labath
Differential Revision: https://reviews.llvm.org/D114465
Padding now can explicitly specify the padding value when non-zero is wanted.
This also includes bypassing pads when the pad does nothing.
Differential Revision: https://reviews.llvm.org/D113611
Transpose convolution decomposition is now performed in a separate pass. This
allows padding / constant propagation to be performed at the TOSA level. It
also adds support for striding when there is no dilation.
Differential Revision: https://reviews.llvm.org/D114409
[NFC] As part of using inclusive language within the llvm project, this patch
replaces master with primary in `LiveRangeUtils.h`.
Reviewed By: MatzeB
Differential Revision: https://reviews.llvm.org/D114191
The benefits of sampling-based PGO crucially depends on the quality of profile
data. This diff implements a flow-based algorithm, called profi, that helps to
overcome the inaccuracies in a profile after it is collected.
Profi is an extended and significantly re-engineered classic MCMF (min-cost
max-flow) approach suggested by Levin, Newman, and Haber [2008, Complementing
missing and inaccurate profiling using a minimum cost circulation algorithm]. It
models profile inference as an optimization problem on a control-flow graph with
the objectives and constraints capturing the desired properties of profile data.
Three important challenges that are being solved by profi:
- "fixing" errors in profiles caused by sampling;
- converting basic block counts to edge frequencies (branch probabilities);
- dealing with "dangling" blocks having no samples in the profile.
The main implementation (and required docs) are in SampleProfileInference.cpp.
The worst-time complexity is quadratic in the number of blocks in a function,
O(|V|^2). However a careful engineering and extensive evaluation shows that
the running time is (slightly) super-linear. In particular, instances with
1000 blocks are solved within 0.1 second.
The algorithm has been extensively tested internally on prod workloads,
significantly improving the quality of generated profile data and providing
speedups in the range from 0% to 5%. For "smaller" benchmarks (SPEC06/17), it
generally improves the performance (with a few outliers) but extra work in
the compiler might be needed to re-tune existing optimization passes relying on
profile counts.
Reviewed By: wenlei, hoy
Differential Revision: https://reviews.llvm.org/D109860
The current TLSDESC optimization code assumes:
```
leaq x@tlsdesc(%rip), %rax
call *x@tlscall(%rax) # adjacent
```
From https://gitlab.freedesktop.org/mesa/mesa/-/issues/5665 , it seems that the
two instructions may not be adjacent in GCC 10's output:
```
leaq x@tlsdesc(%rip), %rax
something else
call *x@tlscall(%rax)
```
This patch supports the case. While here, support non-RAX registers for
R_X86_64_GOTPC32_TLSDESC, in case the compiler generates inefficient:
```
leaq x@tlsdesc(%rip), %rcx # or %rdx, %rbx, %rdi, ...
movq %rcx, %rax
call *x@tlscall(%rax) # GNU ld/gold error for non-RAX
```
Differential Revision: https://reviews.llvm.org/D114416
The INSTR_PROF_RAW_MAGIC_* number in profraw files should match during
profile merging. This causes an error with 32-bit and 64-bit variants
of the same code. The module signatures for the two binaries are
identical but they use different INSTR_PROF_RAW_MAGIC_* causing a
failure when profile-merging is used. Including it when computing the
module signature yields different signatures for the 32-bit and 64-bit
profiles.
Differential Revision: https://reviews.llvm.org/D114054
Support for builtins that use bcdadd./bcdsub. to add/subtract
Binary Coded Decimal values as well as to determine validity
and compare BCD values.
Differential revision: https://reviews.llvm.org/D114088
These variables are not out-params, and we immediately return after assigning them. Thus, the assignments are dead and just confusing.
I believe these used to be out-params, but they're not any more.