`T` is not a valid identifier for libc++ to use, use `_Tp` instead. Caught from D116957
Reviewed By: Quuxplusone
Differential Revision: https://reviews.llvm.org/D117582
Matching an exact byte offset is fragile if a different version of compiler
is used (e.g. distro clang).
Resolves an issue with running with BOLT_CLANG_EXE + clang-12
Reviewed By: maksfb
Differential Revision: https://reviews.llvm.org/D117440
The `InstrProfInstBase` class is for all `llvm.instrprof.*` intrinsics. In a
later diff we will add new instrinsic of this type. Also refactor some
logic in `InstrProfiling.cpp`.
Reviewed By: davidxl
Differential Revision: https://reviews.llvm.org/D117261
This reverts the revert commit e328385739.
Accidental demanded bits change has been removed. The demanded bits
code itself was remove in a pre-commit since it isn't tested.
Original commit message:
Previous we used the fshl/fshr operand ordering for simplicity. This
made things confusing when D117468 proposed adding intrinsics for
the instructions. We can't just use the generic funnel shifting
intrinsics because fsl/fsr have different functionality that should
be exposed to software.
Now we use rs1, rs3, rs2/shamt order which matches the instruction
printing order and the order used in this intrinsic header
https://github.com/riscv/riscv-bitmanip/blob/main-history/cproofs/rvintrin.h
Testing may be easier after D117468. Right now we get demanded bits
optimizations done on ISD::FSHL/FSHR before they become FSR/FSL. This
makes it hard to test.
This fixes a verifier error I ran into at -O0. A subregister copy had
an implicit kill of an overlapping superregister, which was partially
redefined by the copy. The preserved implicit operand killed
subregisters made live earlier in the sequence. AMDGPU already uses
similar logic for whether to preserve the kill of the superregister on
the final instruction if there's overlap.
C17 deprecated ATOMIC_VAR_INIT with the resolution of DR 485. C++
followed suit when adopting P0883R2 for C++20, but additionally chose
to deprecate ATOMIC_FLAG_INIT at the same time despite the macro still
being required in C. This patch marks both macros as deprecated when
appropriate to do so.
The current state of the top level Analysis/ directory is that it contains two libraries;
a generic Analysis library (free from dialect dependencies), and a LoopAnalysis library
that contains various analysis utilities that originated from Affine loop transformations.
This commit moves the LoopAnalysis to the more appropriate home of `Dialect/Affine/Analysis/`,
given the use and intention of the majority of the code within it. After the move, if there
are generic utilities that would fit better in the top-level Analysis/ directory, we can move
them.
Differential Revision: https://reviews.llvm.org/D117351
The function `std::fill` requires a ForwardIterator, but `std::fill_n`
only requires an OutputIterator. Adds a test to validate `std::fill_n`
works with an OutputIterator.
Noticed this while working on LWG3539
format_to must not copy models of output_iterator<const charT&>
Reviewed By: #libc, Quuxplusone, ldionne
Differential Revision: https://reviews.llvm.org/D117395
A clean checkout of LLVM using core.autocrlf=on for Windows cannot pass
the LLVM test suite because several test input files rely on specific
line endings.
This change updates the line ending attributes for impacted tests and
re-normalizes a CRLF test case that was committed as LF to the git
index.
After the changes in D117362 made variables declared inside of a target
declare directive visible outside the plugin, some variables inside the
runtime were given visiblity that conflicted with their address space
type. This caused problems when shared or local memory was made
externally visible. This patch fixes this issue by making these
varialbes static within the module, therefore limiting their visibility
to being internal.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D117526
This patch changes the visiblity of variables declared within a declare
target directive. Variable declarations within a declare target
directive need to be externally visible to the plugin for initialization
or reading. Previously this would cause runtime errors where the named
global could not be found because it was not included in the symbol
table.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D117362
This patch changes the visibility of the `__omp_rtl_debug_kind` variable
to be hidden. These variables are only used by the plugin so they do not
need to be read externally. Previously the default visibility prevented
these variables from being completely eliminated in the module.
Reviewed By: tianshilei1992
Differential Revision: https://reviews.llvm.org/D117320
Previous we used the fshl/fshr operand ordering for simplicity. This
made things confusing when D117468 proposed adding intrinsics for
the instructions. We can't just use the generic funnel shifting
intrinsics because fsl/fsr have different functionality that should
be exposed to software.
Now we use rs1, rs3, rs2/shamt order which matches the instruction
printing order and the order used in this intrinsic header
https://github.com/riscv/riscv-bitmanip/blob/main-history/cproofs/rvintrin.h
The global state refers to the number of the nodes currently in the
module, and the number of direct calls between nodes, across the
module.
Node counts are not a problem; edge counts are because we want strictly
the kind of edges that affect inlining (direct calls), and that is not
easily obtainable without iteration over the whole module.
This patch avoids relying on analysis invalidation because it turned out
to be too aggressive in some cases. It leverages the fact that Node
objects are stable - they do not get deleted while cgscc passes are
run over the module; and cgscc pass manager invariants.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D115847
This patch fixes a case where the 'align' parameter attribute on the
pointer operands to llvm.vp.gather and llvm.vp.scatter was being dropped
during the conversion to the SelectionDAG. The default alignment equal
to the ABI type alignment of the vector type was kept. It also updates
the documentation to reflect the fact that the parameter attribute is
now properly supported.
The default alignment of these intrinsics was previously documented as
being equal to the ABI alignment of the *scalar* type, when in fact that
wasn't the case: the ABI alignment of the vector type was used instead.
This has also been fixed in this patch.
Reviewed By: simoll, craig.topper
Differential Revision: https://reviews.llvm.org/D114423
- Add current iteration to the context of a DWARF expression evaluation.
- Add DW_AT_LLVM_iterations attribute to specify the number of
iterations executing concurrently.
- Add DF_OP_LLVM_push_iteration to support optimizations that result in
multiple iterations executing concurrently.
- Add DW_OP_LLVM_overlay and DW_OP_LLVM_bit_overlay to support
expressing the location of arrays that are promoted to vector
registers in SIMD vectorized loops.
- Generally clarify the difference between SIMT and SIMD execution.
- Change the DW_AT_LLVM_active_lane attribute to take location
description expression so that a loclist can be used to express
different vales at different program locations.
Reviewed By: scott.linder
Differential Revision: https://reviews.llvm.org/D117572
Instead of storing the wrapped iterator inside the stride_counting_iterator,
store its base so we can have e.g. a stride_counting_iterator of an
input_iterator (which was previously impossible because input_iterators
are not copyable). Also a few other simplifications in stride_counting_iterator.
As a fly-by fix, remove the member base() functions, which are super
confusing.
Differential Revision: https://reviews.llvm.org/D116613
This change is the basis for a further refactoring where I'm going to
split up the various implementations we have in __threading_support to
make that code easier to understand.
Note that I had to make __convert_to_timespec a template to break
circular dependencies. Concretely, we never seem to use it with anything
other than ::timespec, but I am wary of hardcoding that assumption as
part of this change, since I suspect there's a reason for going through
these hoops in the first place.
Differential Revision: https://reviews.llvm.org/D116944
LLVM Programmer’s Manual strongly discourages the use of `std::vector<bool>` and suggests `llvm::BitVector` as a possible replacement.
This patch does just that for llvm.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D117121
This was noted as a potential cleanup in D117508.
getShiftAmountTy() has checks for vector, phase, etc. so it should
handle anything that the caller was trying to account for.
Summary:
The patch is based on the EGuesnet's implement of the "Support of Big archive (read)
the first commit of the patch is come from https://reviews.llvm.org/D100651.
the rest of commits of the patch
1 Addressed the comments on the https://reviews.llvm.org/D100651
2 according to https://www.ibm.com/docs/en/aix/7.2?topic=formats-ar-file-format-big
using the "fl_fstmoff" for the first object file number, using "char ar_nxtmem[20]" to get next object file ,
using the "char fl_lstmoff[20]" for the last of the object file will fix the following problems:
2.1 can not correct reading a archive files which has padding data between too object file
2.2 can not correct reading a archive files from which some object file has be deleted
3 introduce a new derived class BigArchive for big ar file.
Reviewers: James Henderson
Differential Revision: https://reviews.llvm.org/D111889
LLVM Programmer’s Manual strongly discourages the use of `std::vector<bool>` and suggests `llvm::BitVector` as a possible replacement.
This patch does just that for clangd and clang-tidy.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D117119
The parsing code for a typename requirement currently asserts when
given something which is not a valid type-requirement
(http://eel.is/c++draft/expr.prim.req.type#nt:type-requirement). This
removes the assertion to continue on to the proper diagnostic.
This resolves PR53057.
Note that in that PR, it is using _BitInt(N) as a dependent type name.
This patch does not attempt to support that as it is not clear that is
a valid type requirement (it does not match the grammar production for
one). The workaround in the PR, however, is definitely valid and works
as expected.
The leading space that is always printed at the beginning of regions is not consistent with other parts of the printing API. Moreover, this leading space can lead to undesirable assembly formats:
```
attr-dict-with-keyword $region
```
Prints as:
```
// Two spaces between `}` and `{`
attributes {foo} { ... }
```
Moreover, the leading space results in the odd generic op format:
```
"test.op"() ( {...}) : () -> ()
```
Reviewed By: rriddle, mehdi_amini
Differential Revision: https://reviews.llvm.org/D117411
This casued a miscompile due to call slot optimization replacing a call
argument without considering the call's !noalias metadata, see discussion on
the code review.
> Call slot optimization is currently supposed to be prevented if
> the call can capture the source pointer. Due to an implementation
> bug, this check currently doesn't trigger if a bitcast of the source
> pointer is passed instead. I'm somewhat afraid of the fallout of
> fixing this bug (due to heavy reliance on call slot optimization
> in rust), so I'd like to strengthen the capture reasoning a bit first.
>
> In particular, I believe that the capture is fine as long as a)
> the call itself cannot depend on the pointer identity, because
> neither dest has been captured before/at nor src before the
> call and b) there is no potential use of the captured pointer
> before the lifetime of the source alloca ends, either due to
> lifetime.end or a return from a function. At that point the
> potentially captured pointer becomes dangling.
>
> Differential Revision: https://reviews.llvm.org/D115615
Also reverting the dependent commit:
> [MemCpyOpt] Look through pointer casts when checking capture
>
> The user scanning loop above looks through pointer casts, so we
> also need to strip pointer casts in the capture check. Previously
> the source was incorrectly considered not captured if a bitcast
> was passed to the call.
This reverts commit 487a34ed9d
and 00e6869463.
The recommendation on Windows is to checkout from git with
core.autolf=false in order to preserve LF line endings on
test files. However, when creating a new check this results
in modified files as having switched all the line endings on
Windows. Write all files with explicit LF line endings to
prevent this.
Fixes#52968
Differential Revision: https://reviews.llvm.org/D117535
ELF object files can contain duplicated sections (thus section symbols
as well), espeically when comdats/section groups are present. This patch
adds support for generating LinkGraph from object files that have
duplicated section names. This is the first step to properly model
comdats/section groups.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D114753
Fix the compatibility of Optional<> with some GCC versions that it will fail
to compile when T is getting checked for `is_trivially_move_constructible`
as mentioned here: https://reviews.llvm.org/D93510#2538983
Fix the problem by using `llvm::is_trivially_move_constructible`.
Reviewed By: jplayer-nv, tatyana-krasnukha
Differential Revision: https://reviews.llvm.org/D117254
Add use of TPAUSE (from WAITPKG) to the runtime for Intel hardware,
with an envirable to turn it on in a particular C-state. Always uses
TPAUSE if it is selected and enabled by Intel hardware and presence of
WAITPKG, and if not, falls back to old way of checking
__kmp_use_yield, etc.
Differential Revision: https://reviews.llvm.org/D115758
The translation from the MLIR LLVM dialect to LLVM IR includes a mechanism that
ensures the successors of a block to be different blocks in case block
arguments are passed to them since the opposite cannot be expressed in LLVM IR.
This mechanism previously only worked for functions because it was written
prior to the introduction of other region-carrying operations such as the
OpenMP dialect, which also translates directly to LLVM IR. Modify this
mechanism to handle all regions in the module and not only functions.
Reviewed By: wsmoses
Differential Revision: https://reviews.llvm.org/D117548
Just replacing std::map with DenseMap here is a major regression
-- because this code used an identity hash for ValueIDNum.
Because ValueIDNum is composed of multiple components, it is
important that we use a reasonably good hash function here, so
switch it to hash_value. DenseMapInfo::getHashValue<uint64_t>
would not be sufficient.
This gives a -0.8% geomean improvement on CTMark ReleaseLTO-g.
When invoking Lit repeatedly, we perform all the configuration checks
over and over again, which takes a lot of time. This patch allows caching
the result of configuration checks persistently across Lit invocations to
speed this up.
In theory, this should still be functionally correct since the cache
key should contain everything that determines the output of the
configuration check. However, in cases where e.g. the compiler has
changed but is at the same path as previously, the Lit configuration
checks will be cached even though technically the cache should have
been invalidated.
Differential Revision: https://reviews.llvm.org/D117361
The FIXME is no longer relevant as ControlFlowContext centralizes the
construction of the CFG.
This is part of the implementation of the dataflow analysis framework.
See "[RFC] A dataflow analysis framework for Clang AST" on cfe-dev.
Reviewed-by: xazax.hun
Differential Revision: https://reviews.llvm.org/D117563