llvm-project

Commit Graph

Author	SHA1	Message	Date
Kevin Sala Penadés	1081bb08cc	[OpenMP][libomptarget] Fix run region async condition This patch fixes a condition in the openmp/libomptarget/src/device.cpp file. The code was checking if the run_region plugin API function was implemented, but it should actually check the run_region_async function instead. Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D131782	2022-08-15 13:08:45 -04:00
Jennifer Yu	2ca27206f9	[OpenMP] Fix segmentation fault when data field is used in is_device_pt Currently, the field just emits map info for this pointer variable, which fails at run time. For fields, the PartialStruct is created and it needs a call to emitCombinedEntry, which creates the base that covers all the pieces. The change is to generate map info as for regular fields. Differential Revision: https://reviews.llvm.org/D129608	2022-08-12 17:10:26 -07:00
Johannes Doerfert	a8cda32909	[OpenMP][FIX] Ensure __kmpc_kernel_parallel is reachable The problem is that we create the call to __kmpc_kernel_parallel in the openmp-opt pass, but while we optimize the code the call is not there yet. Thus, we assume we never reach it from __kmpc_target_deinit. That allows us to remove the store in there (`ParallelRegionFn = nullptr`), which leads to bad results later on. This is a stopgap solution until we come up with something better. Fixes https://github.com/llvm/llvm-project/issues/57064	2022-08-11 09:55:56 -05:00
Joseph Huber	fdbb15355e	[Libomptarget][CUDA] Check CUDA compatibility correctly We recently added support for multi-architecture binaries in libomptarget. This is done by extracting the architecture from the embedded image and comparing it with the major and minor version supported by the current CUDA installation. Previously we just compared these directly, which was not correct for binary compatibility. The CUDA documentation states that an image is compatible with a device that has the same major version and a greater or equal minor version. Change the check to use this new logic in the CUDA plugin. Fixes #57049 Reviewed By: jdoerfert, ye-luo Differential Revision: https://reviews.llvm.org/D131567	2022-08-10 11:15:27 -04:00
Fangrui Song	0972a390b9	LLVM_FALLTHROUGH => [[fallthrough]]. NFC	2022-08-09 04:06:52 +00:00
Jon Chesterfield	104f11630a	[nfc][openmp] clang-format system.cpp prior to D131401	2022-08-08 16:24:34 +01:00
Shilei Tian	294bbdc0b8	[NFC] Fix wrong header in `LibC.cpp`	2022-08-04 23:54:07 -04:00
Shilei Tian	459e3c5184	[OpenMP] Fix the test case issue that printf cannot be used in target region for AMDGPU	2022-08-04 14:48:48 -04:00
Shilei Tian	db5a2afa62	[OpenMP][DeviceRTL] Implement libc function `memcmp` We will add some simple implementation of libc functions starting from this patch, and the first one is `memcmp`, which is reported in #56929. Note that `malloc` and `free` are not included in this patch because of the use of `declare variant`. In the near future we will implement the two functions w/o using any vendor provided function. This fixes #56929. Reviewed By: jhuber6 Differential Revision: https://reviews.llvm.org/D131182	2022-08-04 14:37:54 -04:00
Joseph Huber	b3335e8ed7	[Libomptarget][NFC] Clang format the AMDGPU plugin Summary: A previous patch did not format the plugin again after making changes. Ensure that libomptarget stays formatted.	2022-08-03 15:18:16 -04:00
Joseph Huber	2b7203a359	[Libomptarget] Deinitialize AMDGPU global state more intentionally A previous patch made the destruction of the HSA plugin more deterministic. However, there were still other global values that are not handled this way. When attempting to call a destructor kernel, the device would have already been uninitialized and we could not find the appropriate kernel to call. This is because they were stored in global containers that had their destructors called already. Merges this global state into the rest of the info state by putting those global values inside of the global pointer already allocated and deallocated by the constructor and destructor. This should allow the AMDGPU plugin to correctly identify the destructors if we were to run them. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D131011	2022-08-02 18:24:39 -04:00
Joseph Huber	5afb5312a0	[Libomptarget][NFC] Remove unused CMake file Summary: This file is no longer used, get rid of it.	2022-08-01 16:21:53 -04:00
Joseph Huber	51bda3a0e7	[Libomptarget] Replace std::vector with llvm::SmallVector The runtime makes some use of `std::vector` data structures. We should be able to replace these trivially with `llvm::SmallVector` instead. This should allow us to avoid heap allocations in the majority of cases now. Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D130927	2022-08-01 15:59:15 -04:00
Joseph Huber	1d03b2efcd	[Libomptarget] Disable testing map_back_race.cpp This test hasn't been fixed and causes spurious failures when testing. This patch sets it as unsupported until we have a reliable fix. Reviewed By: ronlieb Differential Revision: https://reviews.llvm.org/D130789	2022-07-30 15:01:47 -04:00
Jon Chesterfield	ed0f218115	[openmp][amdgpu] Tear down amdgpu plugin accurately Moves DeviceInfo global to heap to accurately control lifetime. Moves calls from libomptarget to deinit_plugin later, plugins need to stay alive until very shortly before libomptarget is destructed. Leaving the deinit_plugin calls where initially inserted hits use after free from the dynamic_module.c offloading test (verified with valgrind that the new location is sound with respect to this) Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D130714	2022-07-28 20:00:03 +01:00
Jon Chesterfield	c214cb6a68	[amdgpu][openmp][nfc] Restore stb_local on DeviceInfo symbol	2022-07-28 16:50:46 +01:00
Jon Chesterfield	75aa521064	[openmp][amdgpu] Move global DeviceInfo behind call syntax prior to using D130712	2022-07-28 16:40:42 +01:00
Jon Chesterfield	1f9d3974e4	[openmp] Introduce optional plugin init/deinit functions Will allow plugins to migrate away from using global variables to manage lifetime, which will fix a segfault discovered in relation to D127432 Reviewed By: jhuber6 Differential Revision: https://reviews.llvm.org/D130712	2022-07-28 16:21:38 +01:00
Joseph Huber	b08369f7f2	Revert "[OpenMP] Remove noinline attributes in the device runtime" The behaviour of this patch is not great, but it has some side-effects that are required for OpenMPOpt to work. The problem is that when we use `-mlink-builtin-bitcode` we only import used symbols from the runtime. Then OpenMPOpt will insert calls to symbols that were not previously included. This patch removed this implicit behaviour as these functions were kept alive by the `noinline` simply because it kept calls to them in the module. This caused regression in some tests that relied on some OpenMPOpt passes without using LTO. Reverting for the LLVM15 release but will try to fix it more correctly on main. This reverts commit `d61d72dae6`. Fixes #56752	2022-07-27 11:09:18 -04:00
Saiyedul Islam	4075a811ad	[Libomptarget] Add checks for AMDGPU TargetID using new image info This patch extends the is_valid_binary routine to also check if the binary's target ID matches the one parsed from the system's runtime environment. This should allow us to only use the binary whose compute capability matches, allowing us to support basic multi-architecture binaries for AMDGPU. It also handles compatibility testing of target IDs of the image and the environment. Depends on D127432 Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D127769	2022-07-26 02:44:31 -05:00
Joseph Huber	8c626fc0c8	[Libomptarget] Reintroduce host architecture checks for device RTL A previous patch removed the need to set the auxiliary architecture as it was no longer needed for the clang invocation after moving to using the clang frontend. However, this had a second use of preventing unsupported host architectures from building the device runtime. This caused failures when trying to build on 32-bit hosts for example. Fixes #56699 Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D130509	2022-07-25 17:01:12 -04:00
Joseph Huber	d61d72dae6	[OpenMP] Remove noinline attributes in the device runtime We previously used the `noinline` attributes to specify some definitions which should be kept alive in the runtime. These were then stripped immediately in the OpenMPOpt module pass. However, since the changes in D130298, we now explicitly state which functions will have external visibility in the bitcode library. Additionally the OpenMPOpt module pass should run before the inliner pass, so this shouldn't make a difference in whether or not the functions will be alive for the initial pass of OpenMPOpt. This should simplify the interface, and additionally save time spent on scanning function names for noinline. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D130368	2022-07-25 15:44:50 -04:00
Saiyedul Islam	4cf30c5157	Revert "Revert "Revert "[Libomptarget] Add checks for AMDGPU TargetID using new image info""" This reverts commit `281eb9223c`.	2022-07-25 11:35:37 -05:00
Saiyedul Islam	281eb9223c	Revert "Revert "[Libomptarget] Add checks for AMDGPU TargetID using new image info"" This reverts commit `8cbf4a386b`.	2022-07-25 08:32:26 -05:00
Saiyedul Islam	8cbf4a386b	Revert "[Libomptarget] Add checks for AMDGPU TargetID using new image info" This reverts commit `471f2abc62`.	2022-07-25 05:32:59 -05:00
Saiyedul Islam	471f2abc62	[Libomptarget] Add checks for AMDGPU TargetID using new image info This patch extends the is_valid_binary routine to also check if the binary's target ID matches the one parsed from the system's runtime environment. This should allow us to only use the binary whose compute capability matches, allowing us to support basic multi-architecture binaries for AMDGPU. It also handles compatibility testing of target IDs of the image and the environment. Depends on D127432 Differential Revision: https://reviews.llvm.org/D127769	2022-07-25 04:44:36 -05:00
Shilei Tian	b95d31a849	[OpenMP][Offloading] Enlarge the work size of `wtime.c` in case of any noise	2022-07-22 16:03:39 -04:00
Joel E. Denny	cfa6e79df3	[Libomptarget] Don't report lack of CUDA devices Sometimes libomptarget's CUDA plugin produces unhelpful diagnostics about a lack of CUDA devices before an application runs: ``` $ clang -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa hello-world.c $ ./a.out CUDA error: Error returned from cuInit CUDA error: no CUDA-capable device is detected Hello World: 4 ``` This can happen when the CUDA plugin was built but all CUDA devices are currently disabled in some manner, perhaps because `CUDA_VISIBLE_DEVICES` is set to the empty string. As shown in the above example, it can even happen when we haven't compiled the application for offloading to CUDA. The following code from `openmp/libomptarget/plugins/cuda/src/rtl.cpp` appears to be intended to handle this case, and it chooses not to write a diagnostic to stderr unless debugging is enabled: ``` if (NumberOfDevices == 0) { DP("There are no devices supporting CUDA.\n"); return; } ``` The problem is that the above code is never reached because the earlier `cuInit` returns `CUDA_ERROR_NO_DEVICE`. This patch handles that `cuInit` case in the same manner as the above code handles the `NumberOfDevices == 0` case. Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D130371	2022-07-22 14:46:45 -04:00
Shilei Tian	0c86c4f50c	[OpenMP] Fix test error introduced in D130179	2022-07-22 14:16:47 -04:00
Shilei Tian	602e0eb9f0	[OpenMP][DeviceRTL] Fix the issue that multiple calls to `omp_get_wtime` is optimized out by mistake Multiple calls to `omp_get_wtime` could be optimized out because the function was mistakenly marked as `readnone`. This patch fixes the issue, and also adds support for running optimization on `libomptarget` tests. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D130179	2022-07-22 13:46:45 -04:00
Shilei Tian	77cb30e3a6	Revert "[OpenMP][DeviceRTL] Fix the issue that multiple calls to `omp_get_wtime` is optimized out by mistake" This reverts commit `ad34f1dba8`.	2022-07-22 11:45:13 -04:00
Shilei Tian	ad34f1dba8	[OpenMP][DeviceRTL] Fix the issue that multiple calls to `omp_get_wtime` is optimized out by mistake Multiple calls to `omp_get_wtime` could be optimized out because the function was mistakenly marked as `readnone`. This patch fixes the issue, and also adds support for running optimization on `libomptarget` tests. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D130179	2022-07-22 11:43:30 -04:00
Joseph Huber	a3804a3145	[Libomptarget] Make the plugins link as LLVM libraries Previously we made `libomptarget` link as an LLVM library so we have access to the LLVM core libraries. After the initial patch stuck we can now apply the same changes to the plugins. This will allow us to use LLVM in all of `libomptarget` when we have uses for them. In the future this should allow us to remove the dependencies on `libelf`, `libffi`, and `dl`. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D130262	2022-07-22 09:34:12 -04:00
Joseph Huber	908054df4f	[Libomptarget] Only export needed definitions in the BC library This patch adds the use of the `-internalize-public-api-file` option in the internalization pass to internalize any definition that isn't explicitly needed for the interface. This will allow us to perform more optimizations on the file that normally would not have been possible with functions internal to the library not being internal. Depends on D130293 Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D130298	2022-07-22 08:24:35 -04:00
Joseph Huber	e82e07d74a	[Libomptarget] Build the DeviceRTL BC using clang directly Currently the bitcode library is built using the clang front-end manually. This was originally done because we did not support device only compilation. Now we support device only compilation, at least for a single offloading toolchain, so we can instead use clang directly rather than using the front-end. This saves us needing to define things like `aux_triple`. Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D130293	2022-07-22 08:24:29 -04:00
Ron Lieberman	45a379ce2f	Revert "[Libomptarget] Stop testing CPU offloading with LTO" This reverts commit `3e8d46921f`.	2022-07-22 12:10:06 +00:00
Johannes Doerfert	1da6ae4b54	[OpenMP][FIX] Ensure thread and team state are defined properly The namespaces were missing causing the symbols to have "C" mangling. To avoid this in the future we qualify the names now fully.	2022-07-21 21:57:14 -05:00
Joseph Huber	3e8d46921f	[Libomptarget] Stop testing CPU offloading with LTO Summary: Some of the buildbots don't find the libraries because they don't build for the GPU. Although it should always be there, it's unclear why these buildbots are having problems. LTO is only interesting on the GPU and these tests take extra time anyway so I'm just going to disable them for now.	2022-07-21 16:47:41 -04:00
John Ericson	07b749800c	[cmake] Don't export `LLVM_TOOLS_INSTALL_DIR` anymore First of all, `LLVM_TOOLS_INSTALL_DIR` put there breaks our NixOS builds, because `LLVM_TOOLS_INSTALL_DIR` defined the same as `CMAKE_INSTALL_BINDIR` becomes an absolute path, and then when downstream projects try to install there too this breaks because our builds always install to fresh directories for isolation's sake. Second of all, note that `LLVM_TOOLS_INSTALL_DIR` stands out against the other specially crafted `LLVM_CONFIG_*` variables substituted in `llvm/cmake/modules/LLVMConfig.cmake.in`. @beanz added it in `d0e1c2a550` to fix a dangling reference in `AddLLVM`, but I am suspicious of how this variable doesn't follow the pattern. Those other ones are carefully made to be build-time vs install-time variables depending on which `LLVMConfig.cmake` is being generated, are carefully made relative as appropriate, etc. etc. For my NixOS use-case they are also fine because they are never used as downstream install variables, only for reading not writing. To avoid the problems I face, and restore symmetry, I deleted the export and arranged to have many `${project}_TOOLS_INSTALL_DIR`s. `AddLLVM` now instead expects each project to define its own, and they do so based on `CMAKE_INSTALL_BINDIR`. `LLVMConfig` still exports `LLVM_TOOLS_BINARY_DIR` which is the location for the tools defined in the usual way, matching the other remaining exported variables. For the `AddLLVM` changes, I tried to copy the existing pattern of internal vs non-internal or for LLVM vs for downstream function/macro names, but it would be good to confirm I did that correctly. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D117977	2022-07-21 19:04:00 +00:00
Johannes Doerfert	d150152615	[OpenMP] Introduce more fine-grained control over the thread state use We can help optimizations by making sure we use the team state whenever it is clear there is no thread state. To this end we introduce a new state flag (`state::HasThreadState`) and explicit control for the `state::ValueRAII` helpers, including a dedicated "assert equal". Differential Revision: https://reviews.llvm.org/D130113	2022-07-21 12:30:38 -05:00
Johannes Doerfert	7472b42b78	[OpenMP] Use Undef instead of null as pointer for inactive lanes Our conditional writes in the runtime look like this: ``` if (active) *ptr = value; ``` In the RAII we need to assign `ptr` which comes from a lookup call. If a thread that is not the main thread calls lookup with the intention to write the pointer, we'll create a new thread state. As such, we need to avoid calling lookup for inactive threads. We used to use `nullptr` as their `ptr` value but that can cause pessimistic reasoning. We now use `undef` instead. Differential Revision: https://reviews.llvm.org/D130114	2022-07-21 12:28:45 -05:00
Johannes Doerfert	a42361dc1c	[OpenMP] Expose the state in the header to allow non-lto optimizations We used to inline the `lookup` calls such that the runtime had "known" access offsets when it was shipped. With the new static library build it doesn't as the lookup is an indirection we cannot look through. This should help us optimize the code better until we can do LTO for the runtime again. Differential Revision: https://reviews.llvm.org/D130111	2022-07-21 12:28:44 -05:00
Joseph Huber	e01ce4e88a	[Libomptarget] Add checks for CUDA subarchitecture using new info This patch extends the `is_valid_binary` routine to also check if the binary's architecture string matches the one parsed from the runtime. This should allow us to only use the binary whose compute capability matches, allowing us to support basic multi-architecture binaries for CUDA. Depends on D127432 Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D127505	2022-07-21 13:20:06 -04:00
Joseph Huber	fbcb1ee7f3	[Libomptarget] Add support for offloading binaries in libomptarget The previous patch changed the linker wrapper to embed the offloading binary format inside the target image instead. This will allow us to more generically bundle metadata with these images, such as requires clauses or the target architecture it was compiled for. I wasn't sure how to handle this best, so I introduced a new type that replaces the old `__tgt_device_image` struct that we can expand inside the runtime library. I made the new `__tgt_device_binary` struct pretty much the same for now. In the future we could change this struct to pretty much be the `OffloadBinary` class. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D127432	2022-07-21 13:20:04 -04:00
Joseph Huber	5d8a76feb0	[Libomptarget] Build the device library even if the sm list is empty We previously had some logic that stopped us from building the device runtime if there were no NVPTX architectures provided. This is incorrect because we could have AMDGPU libraries. Even if the lists are empty we should be able to attempt to build these and get dummy output. This will make it much easier for our tooling, which expects certain libraries. If the user wishes to disable the library entirely they should use `-DLIBOMPTARGET_BUILD_DEVICERTL_BCLIB=OFF`. Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D130266	2022-07-21 10:57:47 -04:00
Joseph Huber	dc52712a06	[Libomptarget] Make libomptarget an LLVM library This patch makes libomptarget depend on LLVM libraries to be built. The reason for this is because we already have an implicit dependency on LLVM headers for ELF identification and extraction as well as an optional dependency on the LLVMSupport library for time tracing information. Furthermore, there are changes in the future that require using more LLVM libraries, and will heavily simplify some future code as well as open up the large amount of useful LLVM libraries to libomptarget. This will make "standalone" builds of `libomptarget` more difficult for vendors wishing to ship their own. This will require a sufficiently new version of LLVM to be installed on the system that should be picked up by the existing handling for the implicit headers. The things this patch changes are as follows: - `libomptarget.so` links against LLVMSupport and LLVMObject - `libomptarget.so` is a symbolic link to `libomptarget.so.15` - If using a shared library build, user applications will depend on LLVM libraries as well - We can now use LLVM resources in Libomptarget. Note that this patch only changes this to apply to libomptarget itself, not the plugins. Additional patches will be necessary for that. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D129875	2022-07-20 15:58:06 -04:00
Joseph Huber	b5b20164d2	Revert "[Libomptarget] Make libomptarget an LLVM library" This reverts commit `643dfd97d5`. This patch still makes the AMDGPU buildbots unhappy. Reverting for now until the AMD folks figure it out.	2022-07-20 10:18:55 -04:00
Joseph Huber	6b0db92bbd	[Libomptarget] Fix LTO command line in test Summary: The test passed -offload-lto instead of -foffload-lto.	2022-07-20 10:18:55 -04:00
Joseph Huber	643dfd97d5	[Libomptarget] Make libomptarget an LLVM library This patch makes libomptarget depend on LLVM libraries to be built. The reason for this is because we already have an implicit dependency on LLVM headers for ELF identification and extraction as well as an optional dependency on the LLVMSupport library for time tracing information. Furthermore, there are changes in the future that require using more LLVM libraries, and will heavily simplify some future code as well as open up the large amount of useful LLVM libraries to libomptarget. This will make "standalone" builds of `libomptarget` more difficult for vendors wishing to ship their own. This will require a sufficiently new version of LLVM to be installed on the system that should be picked up by the existing handling for the implicit headers. The things this patch changes are as follows: - `libomptarget.so` links against LLVMSupport and LLVMObject - `libomptarget.so` is a symbolic link to `libomptarget.so.15` - If using a shared library build, user applications will depend on LLVM libraries as well - We can now use LLVM resources in Libomptarget. Note that this patch only changes this to apply to libomptarget itself, not the plugins. Additional patches will be necessary for that. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D129875	2022-07-20 09:52:09 -04:00
Jon Chesterfield	e46f727b38	Revert "[Libomptarget] Make libomptarget an LLVM library" This reverts commit `70039be627`.	2022-07-19 17:59:45 +01:00
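
Note on commit fdbb15355e: the binary-compatibility rule it describes can be illustrated with a minimal sketch. This is not the code from the CUDA plugin. The `ComputeCapability` type and the `parseArch` and `isImageCompatible` helpers are hypothetical names introduced here, and the parser assumes a simplified architecture string of the form `sm_XY` with one digit each for the major and minor version.

```cpp
#include <optional>
#include <string_view>

// Hypothetical helper type: a CUDA compute capability such as 7.5.
struct ComputeCapability {
  int Major;
  int Minor;
};

// Parse a simplified architecture string "sm_XY" (one digit each), e.g.
// "sm_75". Returns std::nullopt for anything else. Assumption: real
// architecture strings may be longer; this sketch ignores those.
inline std::optional<ComputeCapability> parseArch(std::string_view Arch) {
  if (Arch.size() != 5 || Arch.substr(0, 3) != "sm_")
    return std::nullopt;
  const char Maj = Arch[3];
  const char Min = Arch[4];
  if (Maj < '0' || Maj > '9' || Min < '0' || Min > '9')
    return std::nullopt;
  return ComputeCapability{Maj - '0', Min - '0'};
}

// An image built for X.y is binary compatible with a device X.z when the
// major versions match and the device's minor version is at least the
// image's minor version (z >= y).
inline bool isImageCompatible(const ComputeCapability &Image,
                              const ComputeCapability &Device) {
  return Image.Major == Device.Major && Image.Minor <= Device.Minor;
}
```

Under this rule an `sm_70` image is accepted on an `sm_75` device, while an `sm_75` image is rejected on an `sm_70` device, and images with a different major version are always rejected. A direct equality comparison, which the commit message says was used previously, would reject the first case as well.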


994 Commits