Currently, manpage for scan-build is installed as a program, with
permission of 755. This patch makes installation of scan-build.1 as
file, with 644 permission.
Patch by Manas.
Reviewed By: steakhal
Differential Revision: https://reviews.llvm.org/D120646
Minor rearrangment in the order of conversion patterns to identify
differences.
Reviewed By: clementval, schweitz
Differential Revision: https://reviews.llvm.org/D120721
Starting from Windows SDK for Windows 11 (10.0.22000.x), all the system
libraries (.lib files) contain a section with the '/<XFGHASHMAP>/' name.
This looks like the libraries are built with control flow guard enabled:
https://docs.microsoft.com/en-us/cpp/build/reference/guard-enable-control-flow-guard?view=msvc-170
To let the LLVM tools (llvm-ar, llvm-lib) work with these libraries,
this patch just skips the section offset check for sections with the
'/<XFGHASHMAP>/' name.
Closes: llvm/llvm-project#53814
Signed-off-by: Pavel Samolysov <pavel.samolysov@intel.com>
Reviewed By: jhenderson, thieta
Differential Revision: https://reviews.llvm.org/D120645
This patch adds support for recognising vector splats by peeking through bitcasts to vectors with smaller element types - if all the offset subelements are splats then the bitcasted vector is a splat as well.
We don't have great coverage for isSplatValue so I've made this pretty specific to the use case I'm trying to fix - regressions in some vXi64 vector shift by splat cases that 32-bit x86 doesn't recognise because the shift amount buildvector has been type legalised to v2Xi32.
We can add further support (floats, bitcast from larger element types, undef elements) when we have actual test coverage.
Differential Revision: https://reviews.llvm.org/D120553
For conditional branches, we know the value is i1 0 or i1 1 along
the outgoing edges. For switches we can apply exactly the same
optimization, just with the known values determined by the switch
cases.
In convertToThreeAddress handle VOP2 mac/fmac instructions with a
literal src0 operand, since these are prime candidates for
converting to madmk/fmamk.
Previously this would only happen if src0 (or src1) was a register
defined by a move-immediate instruction, but in many cases these
operands have already been folded because SIFoldOperands runs
before TwoAddressInstructionPass.
Differential Revision: https://reviews.llvm.org/D120736
Given that there is only one external user of Lexer::getLangOpts
we can remove getter entirely without much pain.
Differential Revision: https://reviews.llvm.org/D120404
This is part of the implementation of the dataflow analysis framework.
See "[RFC] A dataflow analysis framework for Clang AST" on cfe-dev.
Reviewed-by: ymandel, xazax.hun
Differential Revision: https://reviews.llvm.org/D120711
This aids debugging when working with possibly broken files,
instead of just flat out erroring out without telling what's wrong.
Differential Revision: https://reviews.llvm.org/D120679
As part of https://reviews.llvm.org/D119036
(506cf6dc04), `-DNOMINMAX` was
dropped from the Windows CI configurations, replaced with a
block with `_LIBCPP_PUSH_MACROS`, `#include <__undef_macros>`
and `_LIBCPP_POP_MACROS` (and
`ADDITIONAL_COMPILE_FLAGS: -DNOMINMAX` left in two tests).
However, this workaround breaks the running the libc++ tests
against a different C++ standard library than libc++, as those
macros and that header are libc++ internals.
Therefore, reinstate `-DNOMINMAX` for clang-cl configurations
and remove the libc++ specific bits in filesystem_test_helper.h.
Differential Revision: https://reviews.llvm.org/D120478
The original implementation of this used the presence of a ":" in the module
name as the key, but since we now generate modules with the correct kind, we
can just test that.
Differential Revision: https://reviews.llvm.org/D120764
`hip-openmp-compatible` flag treats hip and hipv4 offload kinds
as compatible with openmp offload kind while extracting code objects
from a heterogenous archive library. Vice versa is also considered
compatible if hip code was compiled with -fgpu-rdc.
This flag only relaxes compatibility criteria on `OffloadKind`,
rest of the components like `Triple` and `GPUArhc` still needs to
be compatible.
Reviewed By: yaxunl
Differential Revision: https://reviews.llvm.org/D120697
This patch added the MC layer support of Zfinx extension.
Authored-by: StephenFan
Co-Authored-by: Shao-Ce Sun
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D93298
This wraps up from D119053. The 2 headers are moved as described,
fixed file headers and include guards, updated all files where the old
paths were detected (simple grep through the repo), and `clang-format`-ed it all.
Differential Revision: https://reviews.llvm.org/D119876
This patch adds assemblyFormat for `omp.critical.declare`, `omp.atomic.read`,
`omp.atomic.write`, `omp.atomic.update` and `omp.atomic.capture`.
Also removing those clauses from `parseClauses` that aren't needed
anymore, thanks to the new assemblyFormats.
Reviewed By: NimishMishra, rriddle
Differential Revision: https://reviews.llvm.org/D120248
This change adds the selection of no-return buffer_* instructions in
tblgen. The motivation for this is to get the no-return atomic isel
working without relying on post-isel hooks so that GlobalISel can start
selecting them (once GlobalISelEmitter allows no return atomic patterns
like how DAGISel does).
This change handles the selection of no-return mubuf_atomic_cmpswap in
tblgen without changing the extract_subreg generation for the return
variant. This handling was done by the post-isel hook.
Differential Revision: https://reviews.llvm.org/D120538
I'm bring up the support of pseudo-probe-based non-CS profile generation. The approach is quite similar to generating dwarf-based non-CS profile. The main difference is for a given linear instruction range, instead of each disassembled instruction, pseudo probes that are covered by the range are processed. The pseudo probe extraction code is shared with CS probe profile generation.
I'm seeing 0.7% performance win for one of our internal large benchmark compared to using non-CS dwarf-based profile, and 0.5% win for another large benchmark when combined with profi.
Reviewed By: wenlei
Differential Revision: https://reviews.llvm.org/D120335
The priority-based inliner currenlty uses block count combined with callee entry count to drive callsite inlining. This doesn't work well with LTO where postlink inlining is driven by prelink-annotated block count which could be based on the merge of all context profiles. I'm fixing it by using callee profile entry count only which should be context-sensitive.
I'm seeing 0.2% perf improvment for one of our internal large benchmarks with probe-based non-CS profile.
Reviewed By: wenlei
Differential Revision: https://reviews.llvm.org/D120784
C++20 non-type template parameter prints `MyType<{{116, 104, 105, 115}}>` when the code is as simple as `MyType<"this">`. This patch prints `MyType<{"this"}>`, with one layer of braces preserved for the intermediate structural type to trigger CTAD.
`StringLiteral` handles this case, but `StringLiteral` inside `APValue` code looks like a circular dependency. The proposed patch implements a cheap strategy to emit string literals in diagnostic messages only when they are readable and fall back to integer sequences.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D115031
Without this, EPCIndirectionUtils::getResolverBlockAddr (and lazy compilation
via EPC) won't work.
No test case: lli is still using LocalLazyCallThroughManager. I'll revisit this
soon when I look at adding lazy compilation support to the ORC runtime.
sparsity values.
Previously, we can't properly handle input tensors with a dimension
ordering that is different from the natural ordering or with a mixed of
compressed and dense dimensions. This change fixes the problems by
passing the dimension ordering and sparsity values to the runtime
routine.
Modify an existing test to test the situation.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D120777
Show the name of of the archive in the error message as well as the name
of the object within it.
Differential Revision: https://reviews.llvm.org/D120689
The standard explicitly allows a comma to be omitted between a 'P'
edit descriptor and a following numeric edit descriptor (e.g., 1PE10.1),
and before and after a '/' edit descriptor, but otherwise requires them
between edit descriptors. Most implementations, however, only require
commas where they prevent ambiguity, and accept things like 1XI10.
This extension is already assumed by the static FORMAT checker in
semantics. Patch the runtime to behave accordingly.
Differential Revision: https://reviews.llvm.org/D120747