This patch adds the `llvm_omp_target_dynamic_shared_alloc` function to
the `omp.h` header file so users can access it by default. Also changed
the name to keep it consistent with the other target allocators. Added
some documentation so users know how to use it. Didn't add the interface
for Fortran since there's no way to test it right now.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D123246
The target allocators have been supported for NVPTX offloading for
awhile. The tests should use the allocators instead of calling the
functions manually. Also the comments indicating these being a preview
should be removed.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D123242
As noticed in D87637, when LLDB crashes, we only print stack traces if
LLDB is directly executed, not when used via Python bindings. Enabling
this by default may be undesirable (libraries shouldn't be messing with
signal handlers), so make this an explicit opt-in.
I "commandeered" this patch from Jordan Rupprecht who put this up for
review originally.
Differential revision: https://reviews.llvm.org/D91835
In addition, fixed a small bug with padding incorrectly inferring output shape for dynaic inputs in convolution
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D121872
The code to check if the regular LTO summary should be emitted and to
add the corresponding module flags was duplicated in the
'EmitAssemblyHelper::EmitAssemblyWithLegacyPassManager' and
'EmitAssemblyHelper::RunOptimizationPipeline' methods.
In order to eliminate these code duplications, the
'EmitAssemblyHelper::shouldEmitRegularLTOSummary' method has been
extracted. The method returns a bool value, the value is 'true' if the
module summary should be emitted. The patch keeps the setting of the
module flags inline.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D123026
This case is handled in neither the folding or canonicalization
patterns. The folding pattern cannot generate new broadcast ops,
so it should be handled by the canonicalization pattern.
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D123307
We should only process APIs declared in the command line inputs to avoid
drowning the ExtractAPI output with symbols the user doesn't care about.
This is achieved by keeping track of the provided input files and
checking that the associated Decl or Macro is declared in one of those files.
Differential Revision: https://reviews.llvm.org/D123148
This lines up with other parts of the codebase that only use special
knowledge about allocator functions if they're builtins.
Differential Revision: https://reviews.llvm.org/D123053
This is part of being able to get rid of two more columns in
MemoryBuiltins.cpp's large table. We'll have two more changes before
we can finish the job.
Differential Revision: https://reviews.llvm.org/D119582
Sometimes we can infer an align from an allocalign but the function
already promised it'd be more-aligned than the allocalign and there's an
existing align that we shouldn't reduce. Make sure we handle that
correctly.
Differential Revision: https://reviews.llvm.org/D121642
This got changed to use hasAttrSomewhere() during review, and I didn't
notice until today when I was writing some tests for another part of
this system that using hasAttrSomewhere only checked the callsite for
allocalign, rather than both the callsite and the definition. This fixes
that by introducing a helper method.
Differential Revision: https://reviews.llvm.org/D121641
This patch synchronizes the structure of the templates with those
in RISCVInstrInfoVVLPatterns.td so that we get patterns with .vx
on the left hand side.
Reviewed By: rogfer01
Differential Revision: https://reviews.llvm.org/D123255
There is a bug in `DeclarationFragments::appendSpace` where the space
character is added to a local copy of the last fragment.
Differential Revision: https://reviews.llvm.org/D123259
This matches VPatIntegerSetCCVL_VI_Swappable. But as noted in the
FIXME this may only be needed due to lack of canonicalization on
VP_SETCC.
Reviewed By: rogfer01
Differential Revision: https://reviews.llvm.org/D123239
This change is essentially a mechanical change which moves the thread
creation and join implementations from src/threads/linux to
src/__support/threads/linux/thread.h. The idea being that, in future, a
pthread implementation can reuse the common thread implementations in
src/__support/threads.
Reviewed By: lntue
Differential Revision: https://reviews.llvm.org/D123287
Introduce _LIBCPP_ASSERTIONS_DISABLE_ASSUME which makes _LIBCPP_ASSERT
not call __builtin_assume when _LIBCPP_ENABLE_ASSERTIONS == 0.
Calling __builtin_assume was introduced in D122397.
__builtin_assume is generally supposed to improve optimizations, but can
cause regressions when LLVM has trouble handling the calls to
`llvm.assume()` (see comments in D122397).
Reviewed By: philnik
Differential Revision: https://reviews.llvm.org/D123175
Summary:
If implicitarg_ptr intrinsic is not used, set implicit kernarg size to 0, otherwise
set it to 256 bytes for code object version 5 (and beyond).
Reviewers: arsenm
Differential Revision: https://reviews.llvm.org/D123262
rGa3b8695bf592 enabled this for znver3, but AMD SoG, Agner and uops.info all agree that even znver1 has a fast per-lane shuffle op (VPSHUFB), but cross-lane shuffles seem to be slow (PERMPS etc.)
Fixes#44140
Differential Revision: https://reviews.llvm.org/D123306
- isValid: FileManager only ever returns valid FileEntries (see next point)
- construction from outside FileManager (both FileEntry and DirectoryEntry).
It's not possible to create a useful FileEntry this way, there are no setters.
This was only used in FileEntryTest, added a friend to enable this.
A real constructor is cleaner but requires larger changes to FileManager.
Differential Revision: https://reviews.llvm.org/D123197
In several cases, a doc is being generated from a .td file that includes
files containing other dialects. Specify the dialect for which the
documentation is being generated explicitly.
Subtraction was previously implemented recursively. This refactors it to be
non-recursive to avoid issues with potential stack overflows.
Reviewed By: Groverkss
Differential Revision: https://reviews.llvm.org/D123248
This is a lshr equivalent to D122340 - if we don't demand any of the additional sign bits introduced by the ashr, the lshr can be treated as an ashr and we can remove the shift entirely if we only demand already known sign bits.
Another step towards PR21929
https://alive2.llvm.org/ce/z/6f3kjq
Differential Revision: https://reviews.llvm.org/D123118