Temporarily revert "tsan: fix leak of ThreadSignalContext for fibers"
because it breaks the LLDB bot on GreenDragon.
This reverts commit 93f7743851.
This reverts commit d8a0f76de7.
AMDGPUPropagateAttributes pass was skipping some of the functions
when cloning. Functions were added to root set and then skipped
on the next interation because they are already in the root set,
while were meant to be processed with different features.
Differential Revision: https://reviews.llvm.org/D76815
This particular debug shows all the time when I'm processing any MLIR
module:
discovered a new reachable node ^bb0
discovered a new reachable node ^bb0
discovered a new reachable node ^bb0
discovered a new reachable node ^bb0
...
(repeated x1875)
I think that printing all the basic blocks in the function is likely low
value enough that we can get away with removing this.
Differential Revision: https://reviews.llvm.org/D76813
We have several test directories whose names end in .spec. Ensure we
don't accidentally ignore them by removing a section of the .gitignore
that's irrelevant for libc++.
Differential Revision: https://reviews.llvm.org/D69195
Summary:
The test performs add on vector<16384xf32> with
number of workgroups = (128, 1, 1)
local workgroup size = (128, 1, 1)
On a NVIDIA Quadro P1000, I see the following results:
Command buffer submit time: 13us
Compute shader execution time: 19.616us
Differential Revision: https://reviews.llvm.org/D76799
AMDGPUPropagateAttributes can swap names while cloning a function.
Only do it if original symbol was not externally visible.
Differential Revision: https://reviews.llvm.org/D76789
Introduced by https://reviews.llvm.org/D72687. This condition can happen
when the tests are not being run at all, and we're only trying to generate
the libc++ headers.
Some tests do not fail at all when -verify is not supported, unless some
arbitrary warning flag is added to make them fail. We currently used
-Werror=unused-result to make them fail, but doing so makes the test
suite a lot more inscrutable. It seems better to just disable those
tests when -verify is not supported.
Differential Revision: https://reviews.llvm.org/D76256
Summary:
Some operations have custom syntax where an operand is always followed by a
specific token of streams if the operand is present. Parsing such operations
requires the ability to optionally parse an operand. Provide a relevant
function in the custom Op parser.
Differential Revision: https://reviews.llvm.org/D76779
Summary:
This patch allows code-sinking in InstCombine to be performed when instruction have uses in llvm.assume.
Use are considered droppable when it is preferable to modify the User such that the use disappears rather than to prevent a transformation because of the use.
for now uses are considered droppable if they are in an llvm.assume.
Reviewers: jdoerfert, nikic, spatel, lebedev.ri, sstefan1
Reviewed By: jdoerfert
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73832
Summary:
Rename `pred_const_range` to `const_pred_range` to make it consistent with
the other pred/succ iterator definitions.
Reviewers: nicholas, dblaikie, nlewycky
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D75962
Summary:
Rename `succ_const_iterator` to `const_succ_iterator` and
`succ_const_range` to `const_succ_range` for consistency with the
predecessor iterators, and the corresponding iterators in
MachineBasicBlock.
Reviewers: nicholas, dblaikie, nlewycky
Subscribers: hiraditya, bmahjour, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D75952
Summary:
The attribute parser fails to correctly parse unsigned 64 bit
attributes as the check `isNegative ? (int64_t)-val.getValue() >= 0
: (int64_t)val.getValue() < 0` will falsely detect an overflow for
unsigned values larger than 2^63-1.
This patch reworks the overflow logic to instead of doing arithmetic
on int64_t use APInt::isSignBitSet() and knowledge of the attribute
type.
Test-cases which verify the de-facto behavior of the parser and
triggered the previous faulty handing of unsigned 64 bit attrbutes are
also added.
Differential Revision: https://reviews.llvm.org/D76493
As reported in PR45298 and PR45299, vector_size type checking would
crash when done in a situation where the scalar is dependent, such as
a member of the current instantiation.
This is because the scalar checking ensures that you can implicitly
convert a value to a vector-type as long as it doesn't require
truncation. It does this by using the constant evaluator to get the
value as a float. Unfortunately, if the scalar is dependent (such as a
member of the current instantiation), we would hit the assert in the
evaluator.
This patch suppresses the truncation- of-value check in the first phase
of translation. All values are properly errored upon instantiation. This
has one minor regression, in that previously in a non-asserts build,
template<typename T>
struct S {
float4 f(float4 f) {
return k + f;
}
static constexpr k = 1.1; // causes a truncation on conversion.
};
would error immediately. Because 'k' is value dependent (as a
member-of-the-current-instantiation), this would still be evaluatable
(despite normally asserting). Due to this patch, this diagnostic is
delayed until instantiation time.
Because using -print-imports is not thread-safe, make the test rely on llvm-dis instead.
Also cover the ICALL-PROM part as intended originally.
Differential Revision: https://reviews.llvm.org/D76775
Add support for combining shuffles to AVX512 truncate instructions - another step toward fixing D56387/D66004. It also fixes SKX code on PR31443.
We could probably extend this further to handle non-VLX truncation cases.
InnerLoopVectorizer's code called during VPlan execution still relies on
original IR's def-use relations to decide which vector code to generate,
limiting VPlan transformations ability to modify def-use relations and still
have ILV generate the vector code.
This commit introduces a VPValue for VPWidenMemoryInstructionRecipe to use as
the stored value. The recipe is generated with a VPValue wrapping the stored
value of the scalar store. This reduces ingredient def-use usage by ILV as a
step towards full VPlan-based def-use relations.
Differential Revision: https://reviews.llvm.org/D76373
This adds a very simple loader. This will be extended to a full loader
in future patches. A utility rule to add unittests has been added to
serve us while we are building out the full loader.
Reviewers: abrachet, phosek
Differential Revision: https://reviews.llvm.org/D76412
Summary:
This patch implements the following CDE intrinsics:
T __arm_vcx1q_m(int coproc, T inactive, uint32_t imm, mve_pred_t p);
T __arm_vcx2q_m(int coproc, T inactive, U n, uint32_t imm, mve_pred_t p);
T __arm_vcx3q_m(int coproc, T inactive, U n, V m, uint32_t imm, mve_pred_t p);
T __arm_vcx1qa_m(int coproc, T acc, uint32_t imm, mve_pred_t p);
T __arm_vcx2qa_m(int coproc, T acc, U n, uint32_t imm, mve_pred_t p);
T __arm_vcx3qa_m(int coproc, T acc, U n, V m, uint32_t imm, mve_pred_t p);
The intrinsics are not part of the released ACLE spec, but internally at
Arm we have reached consensus to add them to the next ACLE release.
Reviewers: simon_tatham, MarkMurrayARM, ostannard, dmgreen
Reviewed By: simon_tatham
Subscribers: kristof.beyls, hiraditya, danielkiss, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D76610
Summary:
One change: because there's no way to signal failure individually for
each cursor, we now "succeed" with an empty range with no parent if a
cursor doesn't point at anything.
Reviewers: usaxena95
Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D76741
Summary:
This patch adds a virtual method `getCPUCacheLineSize()` to `TargetInfo`. Currently, I've only implemented the method in `X86TargetInfo`. It's extremely important that each CPU's cache line size correct (e.g., we can't just define it as `64` across the board) so, it has been a little slow getting to this point.
I'll work on the ARM CPUs next, but that will probably come later in a different patch.
Tags: #clang
Differential Revision: https://reviews.llvm.org/D74918
to reduce spurios changes in patches after clang-formatting them. In
particular, these files contain long enums that clang-format reformats
in their entirety if e.g. an element is added.
Reviews having this problem include https://reviews.llvm.org/D76342 and
https://reviews.llvm.org/D71447.
When creating and destroying fibers in tsan a thread state
is created and destroyed. Currently, a memory mapping is
leaked with each fiber (in __tsan_destroy_fiber).
This causes applications with many short running fibers
to crash or hang because of linux vm.max_map_count.
The root of this is that ThreadState holds a pointer to
ThreadSignalContext for handling signals. The initialization
and destruction of it is tied to platform specific events
in tsan_interceptors_posix and missed when destroying a fiber
(specifically, SigCtx is used to lazily create the
ThreadSignalContext in tsan_interceptors_posix). This patch
cleans up the memory by inverting the control from the
platform specific code calling the generic ThreadFinish to
ThreadFinish calling a platform specific clean-up routine
after finishing a thread.
The relevant code causing the leak with fibers is the fiber destruction:
void FiberDestroy(ThreadState *thr, uptr pc, ThreadState *fiber) {
FiberSwitchImpl(thr, fiber);
ThreadFinish(fiber);
FiberSwitchImpl(fiber, thr);
internal_free(fiber);
}
I would appreciate feedback if this way of fixing the leak is ok.
Also, I think it would be worthwhile to more closely look at the
lifecycle of ThreadState (i.e. it uses no constructor/destructor,
thus requiring manual callbacks for cleanup) and how OS-Threads/user
level fibers are differentiated in the codebase. I would be happy to
contribute more if someone could point me at the right place to
discuss this issue.
Reviewed-in: https://reviews.llvm.org/D76073
Author: Florian (Florian)
tsan while used by golang's race detector was not working on alpine
linux, since it is using musl-c instead of glibc. Since alpine is very
popular distribution for container deployments, having working race
detector would be nice. This commits adds some ifdefs to get it working.
It fixes https://github.com/golang/go/issues/14481 on golang's issue tracker.
Reviewed-in: https://reviews.llvm.org/D75849
Author: graywolf-at-work (Tomas Volf)
The reason is to add .yaml as a valid test suffix. The test folder
contains one yaml file, which wasn't being run because of that.
Unsurprisingly the test fails, but this was not because the underlying
functionality was broken, but rather because the test was setup
incorrectly (most likely due to overly aggressive simplification of the
test data on my part).
Therefore this patch also tweaks the test inputs in order to test what
they are supposed to test, and also updates some other breakpad tests
(because they depend on the same inputs as this one) to be more
realistic -- specifically it avoids putting symbols to the first page of
the module, as that's where normally the COFF header would reside.
Move ARM ConstantIsland and LowOverheadLopps passes later in the pipeline
such that they will be run after the upcoming Machine Outlining pass.
Differential Revision: https://reviews.llvm.org/D76065
Summary: The current ConvertStandardToLLVM phase lowers the standard TanHOp to function calls to external tanh symbols. However, this leads to misunderstandings since these external symbols are not defined anywhere. This commit removes the TanHOp lowering functionality from ConvertStandardToLLVM, adapts the LowerGpuOpsToNVVMOps and LowerGpuOpsToROCDLOps passes and adjusts the affected test cases.
Reviewers: mravishankar, herhut
Subscribers: jholewinski, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, csigg, arpith-jacob, mgester, lucyrfox, aartbik, liufengdb, Joonsoo, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D75509