Make sure we use the libc++ from the build dir. Currently, by passing
-stdlib=libc++, we might pick up the system libc++. This change ensures
that if LLVM_LIBS_DIR is set, we try to use the libc++ from there.
Differential revision: https://reviews.llvm.org/D129166
A previous revision implemented expand/collapse reshaping between
dense and sparse tensors for sparse2dense and dense2sparse since those
could use the "cheap" view reshape on the already materialized
dense tensor (at either the input or output side), and do some
reshuffling from or to sparse. The dense2dense case, as always,
is handled with a "cheap" view change.
This revision implements the sparse2sparse cases. Lacking any "view"
support on sparse tensors this operation necessarily has to perform
data reshuffling on both ends.
Tracker for improving this:
https://github.com/llvm/llvm-project/issues/56477
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D129416
This patch adds a new flag vfsoverlay similar to clang’s
ivfsoverlay flag. This is helpful when compiling on case
sensitive file systems when cross compiling to Windows.
Particularly when compiling third party code containing
\#pragma comment(“linker”, “/defaultlib:...”) which
can’t be easily changed.
Differential Revision: https://reviews.llvm.org/D125800
As far as I can tell what was happening in the original code is
that the getNode call receives the same operands as the original
node with different SDNodeFlags. The logic inside getNode detects
that the node already exists and intersects the flags into the
existing node and returns it. This results in Op and NewOp for the
TLO.CombineTo call always being the same node.
We may have already called CombineTo as part of the recursive handling.
A second call to CombineTo as we unwind the recursion overwrites
the previous CombineTo. I think this means any time we updated the
poison flags that was the only change that ends up getting made
and we relied on DAGCombiner to revisit and call SimplifyDemandedBits
again. The second time the poison flags wouldn't need to be dropped
and we would keep the CombineTo call from further down the recursion.
We can instead call setFlags to drop the poison flags and remove the
call to TLO.CombineTo. This way we keep the CombineTo from deeper in
the recursion which should be more efficient.
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D129511
Currently the autogenerated regalloc model will sometimes
output an incorrect LR index to evict instead of the first LR
with with the mask set to 1. This trips an assertion within
the MLRegallocAdvisor that the evicted LR has a mask of 1. This
patch, made possible by https://reviews.llvm.org/D124565, simplifies
the autogenerated model by taking away all unnecessary features and
getting rid of the functions that were previously to mix in all
the necessary inputs so they wouldn't get pruned by the Tensorflow
XLA AOT compiler. This is no longer necessary after the previously
mentioned patch. This also fixes the nondeterministic behavior
that is sometimes observed where the autogenerated model will
simply output 0 instead of the correct index.
Reviewed By: yundiqian
Differential Revision: https://reviews.llvm.org/D129254
It is generally not a good idea to mix usage of glibc headers and Linux UAPI
headers (https://sourceware.org/glibc/wiki/Synchronizing_Headers). In glibc
since 7eae6a91e9b1670330c9f15730082c91c0b1d570 (milestone: 2.36), sys/mount.h
defines `fsconfig_command` which conflicts with linux/mount.h:
.../usr/include/linux/mount.h:95:6: error: redeclaration of ‘enum fsconfig_command’
Remove #include <linux/fs.h> which pulls in linux/mount.h. Expand its 4 macros manually.
Android sys/mount.h doesn't define BLKBSZGET and it still needs linux/fs.h.
In the long term we should move Linux specific definitions to sanitizer_platform_limits_linux.cpp
but this commit is easy to cherry pick into older compiler-rt releases.
Fix https://github.com/llvm/llvm-project/issues/56421
Reviewed By: #sanitizers, vitalybuka, zatrazz
Differential Revision: https://reviews.llvm.org/D129471
This patch adds the necessary changes required to bundle and wrap HIP
files. The bundling is done using `clang-offload-bundler` currently to
mimic `fatbinary` and the wrapping is done using very similar runtime
calls to CUDA. This still does not support managed / surface / texture
variables, that would require some additional information in the entry.
One difference in the codegeneration with AMD is that I don't check if
the handle is null before destructing it, I'm not sure if that's
required.
With this we should be able to support HIP with the new driver.
Depends on D128850
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D128914
This patch adds the small change required to output offloading entried
for HIP instead of CUDA. These should be placed in different sections so
because they need to be distinct to the offloading toolchain, otherwise
we'd have HIP trying to register CUDA kernels or vice-versa. This patch will
precede support for HIP in the linker wrapper.
Reviewed By: yaxunl, tra
Differential Revision: https://reviews.llvm.org/D128850
OpenMP supports multiple offloading toolchains and architectures. In
order to support this we originally used `getArgsForToolchain` to get
the arguments only intended for each toolchain. This allowed users to
manually specify if an `--offload-arch=` argument was intended for which
toolchain using `-Xopenmp-target=` or other methods. For example,
```
clang input.c -fopenmp -fopenmp-targets=nvptx64,amdgcn -Xopenmp-target=nvptx64 --offload-arch=sm_70 -Xopenmp-target=amdgcn --offload-arch=gfx908
```
However, this was causing problems with the AMDGPU toolchain. This is
because the AMDGPU toolchain for OpenMP uses an `amdgpu` arch to determine the
architecture. If this tool is not availible the compiler will exit with an error
even when manually specifying the architecture. This patch pulls out the logic in
`getArgsForToolchain` and specializes it for extracting `--offload-arch`
arguments to avoid this.
Reviewed By: JonChesterfield, yaxunl
Differential Revision: https://reviews.llvm.org/D129435
Users upgrading to PHP 8.1 might start observing failures with `arc`.
Commit @ychen's suggestions as a patch in tree that can be applied since
arcanist is no longer accepting patches.
Also, remove the suggestion to apply an external patch updating CA
certs. It seems that this was fixed in upstream arcanist before they
stopped accepting patches. Compare
e3659d43d8
vs
13d3a3c3b1
Link: https://secure.phabricator.com/book/phabcontrib/article/contributing_code/
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D129232
This fixes https://github.com/llvm/llvm-project/issues/56238. ld64.lld currently does not generate __dof section in Mach-O, and -no_dtrace_dof option is on by default. However when there are user-defined dtrace symbols, ld64.lld will treat them as undefined symbols, which causes the linking to fail because lld cannot find their definitions. This patch allows ld64.lld to rewrite the instructions calling dtrace symbols to instructions like nop as what ld64 does; therefore, when encountered with user-provided dtrace probes, the linking can still succeed.
I'm not sure whether support for dtrace is expected in lld, so for now I didn't add codes to make lld emit __dof section like ld64, and only made it possible to link with dtrace symbols provided. If this feature is needed, I can add that part in Dtrace.cpp & Dtrace.h.
Reviewed By: int3, #lld-macho
Differential Revision: https://reviews.llvm.org/D129062
Previously, FIRLangRef.md was incorrectly formatted.
This was due to how FIRLangRef.md had no page header,
and so the first entry would render incorrectly.
This patch introduces a header file, which is prepended to the FIRLangRef
before it becomes a HTML file. The header is currently brief
but can be expanded upon at a later date if required.
This formatting fix also means the index page
can correctly generate a link to FIRLangRef.html and as such,
this patch also removes FIRLangRef from the sidebar and adds it to the main list of links.
Depends on D128650
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D129186
It is generally not a good idea to mix usage of glibc headers and Linux UAPI
headers (https://sourceware.org/glibc/wiki/Synchronizing_Headers). In glibc
since 7eae6a91e9b1670330c9f15730082c91c0b1d570 (milestone: 2.36), sys/mount.h
defines `fsconfig_command` which conflicts with linux/mount.h:
.../usr/include/linux/mount.h:95:6: error: redeclaration of ‘enum fsconfig_command’
Remove #include <linux/fs.h> which pulls in linux/mount.h. Expand its 4 macros manually.
Fix https://github.com/llvm/llvm-project/issues/56421
Reviewed By: #sanitizers, vitalybuka, zatrazz
Differential Revision: https://reviews.llvm.org/D129471
The Fortran Language Reference is currently generated via tablegen,
however isn't present on flang.llvm.org/docs/
This patch adds FIRLangRef.md to the flang/docs directoy,
and adds a link to the generated HTML file in sidebar
under the 'Documentation' heading.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D128650
Add a semantics test for the intrinsic function image_status. Add
a check and restriction on the image argument in image_status,
ensuring that it is a positive value. Add same check on the
size argument of the intrinsic ishftc. Add another check on
the shift argument of ishftc, ensuring that it is less than or
equal to the size argument. Add a short semantics test checking
these restrictions in ishftc function calls.
Reviewed By: klausler
Differential Revision: https://reviews.llvm.org/D128009
This custom isel was used to split the lo12 bits of the imm so that
they could be folded into load/store addresses via a post-isel
peephole.
This patch instead splits the immediate during isel and folds the
lo12 removing the need for the post-isel peephole to do anything.
After this we'll be able to remove the post-isel peephole.
Reviewed By: asb, luismarques
Differential Revision: https://reviews.llvm.org/D129450
After https://reviews.llvm.org/D129237, the assumption
that any non-null data contains a valid vmar handle is no
longer true. Generally this code here needs cleanup, but
in the meantime this fixes errors on Fuchsia.
Differential Revision: https://reviews.llvm.org/D129331
This patch adds OMPIRBuilder support for the simdlen clause for the
simd directive. It uses the simdlen support in OpenMPIRBuilder when
it is enabled in Clang. Simdlen is lowered by OpenMPIRBuilder by
generating the loop.vectorize.width metadata.
Reviewed By: jdoerfert, Meinersbur
Differential Revision: https://reviews.llvm.org/D129149
Tosa to Linalg conversion crashes when input tensor is a float type other than fp32.
Because tosa.clamp and tosa.reluN have fp32 min/max attribute which is converted as arith.constant with the attribute type.
This commit fixes the crash by correctly setting the float constant type from the input tensor.
Reviewed By: eric-k256
Differential Revision: https://reviews.llvm.org/D128630