This reverts commit 620d99b7ed.
Let's see if removing the two offending RUN lines makes this patch pass.
Not ideal to drop tests, but it's just a debugging feature, probably
not that important.
We should not assume that the cet.h header exists just because
we're on x86 Linux. Only include it if __CET__ is defined. This
makes the code more similar to what compiler-rt does in
ee423d93ea/compiler-rt/lib/builtins/assembly.h (L17)
(though that one also has a __has_include() check -- I've not found
that to be necessary).
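A minimal sketch of the resulting guard (simplified; the `_CET_ENDBR`
fallback is an assumption based on what glibc's cet.h provides, and the
linked compiler-rt assembly.h shows the `__has_include()` variant):

```cpp
// Only pull in cet.h when the compiler says CET is enabled, instead of
// assuming the header exists on every x86 Linux system.
#if defined(__CET__)
#include <cet.h>   // provides _CET_ENDBR for CET-enabled builds
#else
#define _CET_ENDBR // no-op when CET is not enabled
#endif
```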
Differential Revision: https://reviews.llvm.org/D119697
This is a follow-up to D119695 using the suggestion by joerg. Rather
than manually declaring madvise() on __sun__, this uses
posix_madvise() if available, which does get declared properly on
Illumos.
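A hedged sketch of the approach; the helper name and the
HAVE_POSIX_MADVISE feature macro below are illustrative, not from the
patch itself:

```cpp
#include <cstddef>
#include <sys/mman.h>

static void adviseDontNeed(void *Addr, size_t Len) {
#if defined(HAVE_POSIX_MADVISE)
  // posix_madvise() is declared by the system headers on Illumos,
  // so no manual prototype is needed.
  ::posix_madvise(Addr, Len, POSIX_MADV_DONTNEED);
#else
  ::madvise(Addr, Len, MADV_DONTNEED);
#endif
}
```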
Differential Revision: https://reviews.llvm.org/D119856
These tests are dumped without optimization, which makes them overly
long and full of meaningless loads/stores. Clean them up to prepare
for future headers update.
Commit 4a794d848c caused a build failure with -Werror -Wpessimizing-move on the clang-ppc64-aix buildbot. This patch applies Clang's suggestion to remove `std::move`.
The goal is to support tail and mask policies in RVV builtins.
We focus on IR part first.
If the passthru operand is undef, we use tail agnostic, otherwise
use tail undisturbed.
My plan is to handle more complex operations in follow-up patches.
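A minimal sketch of the policy choice, with a hypothetical helper name
(the real logic lives in the RISC-V intrinsic lowering):

```cpp
#include "llvm/IR/Constants.h"
#include "llvm/IR/Value.h"

static bool useTailAgnostic(llvm::Value *Passthru) {
  // An undef passthru means no previous register contents need to be
  // preserved, so the tail elements may be agnostic.
  return llvm::isa<llvm::UndefValue>(Passthru);
}
```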
Reviewers: frasercrmck
Differential Revision: https://reviews.llvm.org/D118253
This patch fixes:
https://github.com/llvm/llvm-project/issues/53357.
Simply, it checks for (x & y) and ~(x | y) in
haveNoCommonBitsSet. Since they cannot have common bits set (which can
be verified by enumerating all cases), we can convert
(x & y) + ~(x | y) to (x & y) | ~(x | y), which the compiler can then
handle in InstCombineAndOrXor.
Furthermore, since ((x & y) + (~x & ~y)) is first converted to
((x & y) + ~(x | y)), this patch fixes that case too.
https://alive2.llvm.org/ce/z/qsKzRS
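A simplified sketch of the new check (not the verbatim
haveNoCommonBitsSet code; commuted And/Or operands are omitted):

```cpp
#include "llvm/IR/PatternMatch.h"
using namespace llvm;
using namespace llvm::PatternMatch;

// (X & Y) and ~(X | Y) can never both have the same bit set, so an
// add of the two behaves like an or.
static bool haveNoCommonBitsSketch(Value *LHS, Value *RHS) {
  Value *X, *Y;
  // Match (X & Y) on one side and ~(X | Y) on the other, either way.
  if (match(LHS, m_And(m_Value(X), m_Value(Y))) &&
      match(RHS, m_Not(m_Or(m_Specific(X), m_Specific(Y)))))
    return true;
  if (match(RHS, m_And(m_Value(X), m_Value(Y))) &&
      match(LHS, m_Not(m_Or(m_Specific(X), m_Specific(Y)))))
    return true;
  return false;
}
```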
Reviewed By: spatel, xbolva00, RKSimon, lebedev.ri
Differential Revision: https://reviews.llvm.org/D118094
This patch implements RVO for the coroutine return object obtained
from get_return_object.
From [dcl.fct.def.coroutine]/p7 we know that the return value of
get_return_object is either a reference or a prvalue, so it makes
sense to perform copy elision for the return value. The return object
should be constructed directly into the storage where it would
otherwise be copied/moved to.
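An illustrative coroutine type (not from the patch) showing the shape
of code this affects:

```cpp
#include <coroutine>

struct task {
  struct promise_type {
    // Returns a prvalue, so per [dcl.fct.def.coroutine]/p7 the task
    // can be constructed directly into the caller's return slot.
    task get_return_object() { return task{}; }
    std::suspend_never initial_suspend() noexcept { return {}; }
    std::suspend_never final_suspend() noexcept { return {}; }
    void return_void() {}
    void unhandled_exception() {}
  };
};

task example() { co_return; }
```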
Test Plan: folly, check-all
Reviewed By: junparser
Differential revision: https://reviews.llvm.org/D117087
This gives an approximate error location. Although not very
accurate, it suffices for debugging.
Reviewed By: myhsu
Differential Revision: https://reviews.llvm.org/D119684
Refactor the instructions in M68kInstrControl.td to use the VarLenCodeEmitter.
This patch is tested by the existing test cases.
Reviewed By: myhsu, ricky26
Differential Revision: https://reviews.llvm.org/D119665
I was looking at Stream::PutRawBytes and thought I spotted a bug because
both loops use `i < src_len` as the loop condition despite iterating
in opposite directions.
On closer inspection, the existing code is correct, because it relies on
well-defined unsigned integer wrapping. Correct doesn't mean readable,
so this patch changes the loop condition to compare against 0 when
decrementing i while still covering the edge case of src_len potentially
being 0 itself.
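A simplified sketch of the change (the `putByte` helper is a
hypothetical stand-in for the per-byte write; this is not the verbatim
Stream::PutRawBytes body):

```cpp
#include <cstddef>
#include <cstdint>

void putByte(uint8_t b); // hypothetical per-byte write

// Before: `for (size_t i = src_len - 1; i < src_len; --i)` relied on
// unsigned wrap-around (decrementing past 0 yields SIZE_MAX, which is
// >= src_len, so the loop stops).
// After: count down while i > 0 and index with i - 1, which also
// covers src_len == 0 without wrapping.
void putBytesReversed(const uint8_t *src, size_t src_len) {
  for (size_t i = src_len; i > 0; --i)
    putByte(src[i - 1]);
}
```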
Differential revision: https://reviews.llvm.org/D119857
Volatile stores do not impose any special rules for reordering with
atomics; the usual must-alias analysis is enough here.
This makes the behavior similar to how volatile loads are handled.
Reviewers: reames, nikic
Reviewed By: reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D119818
The prior page was the proposal doc; this one is now
more about what the project intends to do, written in the
present tense.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D119379
For AMDGPU the insertion point for a block may not be the first
non-PHI instruction. This happens when a block contains EXEC
mask manipulation related to control flow (converging lanes).
Use SkipPHIsAndLabels to determine the block insertion point
so that the target can skip any block prologue instructions.
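A simplified sketch of the fix (not the exact call site touched by the
patch):

```cpp
#include "llvm/CodeGen/MachineBasicBlock.h"
using namespace llvm;

// SkipPHIsAndLabels lets the target skip block-prologue instructions
// (e.g. AMDGPU's EXEC mask manipulation), unlike a plain
// first-non-PHI scan.
MachineBasicBlock::iterator blockInsertionPoint(MachineBasicBlock &MBB) {
  return MBB.SkipPHIsAndLabels(MBB.begin());
}
```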
Reviewed By: rampitec, ruiling
Differential Revision: https://reviews.llvm.org/D119399
This implements a basic arm64 loader for Linux, and all the currently
enabled linker tests pass. TLS is not implemented, and functions
using it will have undefined behaviour. Notably, the TLS test is
currently disabled on x86_64.
Much of the structure is copied from x86_64 to allow for a refactoring
of the start code between architectures.
Tested:
ninja libc_loader_tests on aarch64-linux.
Co-authored-by: Raman Tenneti <rtenneti@google.com>
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D119641
This reverts commit 25cdf87b13.
125abb61f7 reverted the patch.
It is incorrect to update this file when a flagless warning is added.
This test exists to make sure _all_ warnings are behind a flag.
The right fix is to put the new warning in a warning group (so that
it can be toggled with a flag), not to update the list here.
Fusion of `linalg.generic` with
`tensor.expand_shape`/`tensor.collapse_shape` currently handles
reshapes by expanding the dimensionality of the `linalg.generic`
operation. This helps fuse elementwise operations better, since they
are fused at the highest dimensionality while keeping all the indexing
maps involved projected permutations. The intent of these patterns is
to push the reshapes to the boundaries of functions.
The presence of named ops (or other ops across which the reshape
cannot be propagated) stops the propagation to the edges of the
function. At this stage, the converse patterns that fold the reshapes
with generic ops by collapsing the dimensions of the generic op can
push the reshape towards the edges. In particular, it helps the case
where reshapes exist between named ops and generic ops.
`linalg.named_op` -> `tensor.expand_shape` -> `linalg.generic`
Pushing the reshape down will help fusion of `linalg.named_op` ->
`linalg.generic` using tile + fuse transformations.
This pattern is intended to replace the following patterns:
1) FoldReshapeByLinearization: these patterns create indexing maps
that are not projected permutations, which hampers future
transformations. They are only useful for folding unit dimensions.
2) PushReshapeByExpansion: this pattern has the same functionality
but has some restrictions:
a) It tries to avoid creating new reshapes, which limits its
applicability. The pattern added here can achieve the same
functionality through use of the `controlFn`, which gives clients
of the pattern the freedom to make this decision (see the sketch
after this list).
b) It does not work for ops with indexing semantics.
These patterns will be deprecated in a future patch.
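A hedged sketch of how a client might plug in such a control function;
the population-function name and the callback signature below are
assumptions for illustration, not necessarily the exact API added by
this patch:

```cpp
#include "mlir/Dialect/Linalg/Transforms/Transforms.h"
#include "mlir/Transforms/GreedyPatternRewriteDriver.h"
using namespace mlir;

// Hypothetical client: only collapse when fusion is judged profitable.
void applyCollapsingFusion(Operation *op, MLIRContext *ctx) {
  RewritePatternSet patterns(ctx);
  // Assumed callback shape: the client sees the candidate use and
  // decides whether folding the reshape by collapsing is worthwhile.
  auto controlFn = [](OpOperand *fusedOperand) { return true; };
  linalg::populateFoldReshapeOpsByCollapsingPatterns(patterns, controlFn);
  (void)applyPatternsAndFoldGreedily(op, std::move(patterns));
}
```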
Differential Revision: https://reviews.llvm.org/D119365
We only need to run propagation on the use instructions of the
original value, rather than on the replacing constant value, which
might have lots of irrelevant uses. This is done by caching the uses
before replacing.
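A hedged sketch of the idea (`propagateTo` is a hypothetical stand-in
for the propagation step):

```cpp
#include "llvm/ADT/SmallVector.h"
#include "llvm/IR/Constants.h"
using namespace llvm;

void propagateTo(User *U); // hypothetical propagation hook

// Snapshot the users of the original value before RAUW, then propagate
// only to those instructions rather than to every use of the constant.
void replaceAndPropagate(Value *Old, Constant *New) {
  SmallVector<User *, 8> Users(Old->user_begin(), Old->user_end());
  Old->replaceAllUsesWith(New);
  for (User *U : Users)
    propagateTo(U);
}
```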
Differential Revision: https://reviews.llvm.org/D119815
The goal is to support tail and mask policies in RVV builtins.
We focus on IR part first.
If the passthru operand is undef, we use tail agnostic, otherwise
use tail undisturbed.
Add a passthru operand for VSLIDE1UP_VL and VSLIDE1DOWN_VL to support
i64 scalars on rv32.
The masked VSLIDE1 only emits the mask-undisturbed policy, regardless
of whether mask-agnostic is requested, until InsertVSETVLI supports
mask agnostic.
Reviewed by: craig.topper, rogfer01
Differential Revision: https://reviews.llvm.org/D117989
`parseSections()` is getting a bit large and unwieldy, so let's
factor out logic where we can.
Other minor changes in this diff:
* `"__cg_profile"` is now a global constexpr
* We now use `checkError()` instead of `fatal()`-ing without handling
the Error
* Check for `callGraphProfileSort` before checking the section name,
since the boolean comparison is likely cheaper
Reviewed By: #lld-macho, lgrey, oontvoo
Differential Revision: https://reviews.llvm.org/D119892
Added the ability to append new entries to a DIE. This is useful to
standardize DWARF4 Split DWARF and simplify the implementation of
DWARF5.
Multiple DIEs can share an abbrev, so the current limitation is that
only unique attributes can be added.
Reviewed By: maksfb
Differential Revision: https://reviews.llvm.org/D119577
This change exposes printer flags in AsmState and AsmStateImpl. All functions
receiving AsmState as a parameter now use the flags from the AsmState instead of
taking an additional OpPrintingFlags parameter.
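Illustrative caller code after this change (not taken from the patch):
the flags are supplied once when the AsmState is created and reused by
subsequent print calls.

```cpp
#include "mlir/IR/AsmState.h"
#include "mlir/IR/Operation.h"
#include "llvm/Support/raw_ostream.h"

// Print an op with large constants elided; the flags live in the state.
void printElided(mlir::Operation *op, llvm::raw_ostream &os) {
  mlir::OpPrintingFlags flags;
  flags.elideLargeElementsAttrs(/*largeElementLimit=*/16);
  mlir::AsmState state(op, flags); // flags captured by the AsmState
  op->print(os, state);            // no separate OpPrintingFlags argument
}
```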
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D119870
Symbols with regular GOT entries do need to be exported, but those
that are internalized (and have dummy/internal GOT entries) need not
be exported.
This happens to fix the failures on the emscripten waterfall where extra
symbols were being exported by the linker (and then later removed by
wasm-opt).
Differential Revision: https://reviews.llvm.org/D119902
The changes from the One Ranges Proposal amount to adding:
- a constructor that takes a `default_sentinel_t` and is equivalent to
the default constructor;
- an `operator==` that compares the iterator to `default_sentinel_t`.
The original proposal defined two overloads for `operator==` (different
argument order) as well as `operator!=`. This has been removed by
[P1614](https://wg21.link/p1614).
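For illustration (the commit does not name the iterator here;
`std::istream_iterator`, which gained the same two members in C++20,
is assumed):

```cpp
#include <iostream>
#include <iterator>

// default_sentinel now stands in for the default-constructed
// end-of-stream iterator in comparisons.
long sumInts(std::istream &in) {
  long total = 0;
  for (std::istream_iterator<int> it(in); it != std::default_sentinel; ++it)
    total += *it;
  return total;
}
```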
Differential Revision: https://reviews.llvm.org/D119620
Previously, building LLVM-libc with GWP ASAN was conditioned on the
flag COMPILER_RT_BUILD_GWP_ASAN, which caused issues due to the
default value of the flag being set in the compiler-rt CMake, which is
separate. Now GWP ASAN is included based on whether it exists as a
target, which is more consistent.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D119789
I'm trying to get libc++ to the point of being able to run clang-tidy. This is a PR to see if clang-tidy is happy with all the CI configs.
Reviewed By: Quuxplusone, ldionne, #libc
Spies: mgorny, aheejin, libcxx-commits, arichardson
Differential Revision: https://reviews.llvm.org/D117174
The tablegen lines that specify the XPLINK64 calling convention for promoting an f32 vararg to an f64 are effectively overridden by the following tablegen line, which bitcasts an f64 vararg to an i64 (so that it can be used in the GPRs). The result is a bitcast from f32 to i64.
Since we don't handle bitcasts for f32, this caused an assertion.