41 Commits

Author SHA1 Message Date
Chandler Carruth 57b08b0944 Update more file headers across all of the LLVM projects in the monorepo
to reflect the new license. These used slightly different spellings that
defeated my regular expressions.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351648
2019-01-19 10:56:40 +00:00
Philipp Schaad cf0a22f786 [GPUJIT] Improved temporary file handling.
Summary: Improved the way the GPUJIT handles temporary files for Intel's Beignet.

Reviewers: bollu, grosser

Reviewed By: grosser

Subscribers: philip.pfaffe, pollydev

Differential Revision: https://reviews.llvm.org/D37691

llvm-svn: 313623
2017-09-19 10:41:29 +00:00
Philipp Schaad 8cb2e3245c [Polly][GPGPU] Fixed undefined reference for CUDA's managed memory in Runtime library.
llvm-svn: 311848
2017-08-27 12:50:51 +00:00
Siddharth Bhat 14544a8068 [GPUJIT] Make max managed pointers an environment variable.
This was originally a `#define`. It is much easier to play around with
this as an environment variable when we run on large programs.

Differential Revision: https://reviews.llvm.org/D37012

llvm-svn: 311471
2017-08-22 17:32:27 +00:00
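
A minimal sketch of replacing a compile-time `#define` with an environment lookup, as the commit above describes. The function name, the default value, and the environment variable name passed in by the caller are hypothetical; the commit message does not state the real ones.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical fallback; the original compile-time value is not given here. */
#define DEFAULT_MAX_MANAGED_POINTERS 4000

/* Returns the limit stored in the environment variable EnvName, or the
   default when the variable is unset, empty, or not a positive integer. */
long maxManagedPointers(const char *EnvName) {
  const char *Value = getenv(EnvName);
  if (!Value || *Value == '\0')
    return DEFAULT_MAX_MANAGED_POINTERS;

  char *End = NULL;
  errno = 0;
  long Parsed = strtol(Value, &End, 10);
  if (errno != 0 || *End != '\0' || Parsed <= 0)
    return DEFAULT_MAX_MANAGED_POINTERS;
  return Parsed;
}
```

Falling back to the default on malformed input keeps a typo in the shell from silently setting the limit to zero.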
Siddharth Bhat 9a5a278f78 [GPUJIT] Switch from Runtime API calls for managed memory to Driver API calls.
We now load the function pointer for `cuMemAllocManaged` dynamically, so
it should be possible to compile and link `GPUJIT` on non-CUDA systems
again.

Thanks to Philipp Schaad for noticing this inconsistency.

Differential Revision: https://reviews.llvm.org/D36921

llvm-svn: 311289
2017-08-20 13:38:04 +00:00
Siddharth Bhat bb30377c5a [Polly] [GPUJIT] Set min size to 1 on CUDA allocation calls. [NFC]
Requesting size 0 allocations from `cuMalloc` / `cuMallocManaged` fails.
If there is a size 0 allocation that can be statically proved, then we
fail at PPCGCodeGeneration. This is because if a size 0 allocation could
take place, we should not generate code that tries to use this array.

However, there are cases where we cannot statically prove this, and at
runtime we get a request for 0 bytes of memory. We choose to allocate
size 1 to allow the program to continue running.

Differential Revision: https://reviews.llvm.org/D36751

llvm-svn: 310941
2017-08-15 18:21:38 +00:00
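
The runtime fallback described above can be sketched as a clamp applied just before the allocation call. The function names are hypothetical, and host `malloc` stands in for the device allocator so the sketch runs without CUDA.

```c
#include <stddef.h>
#include <stdlib.h>

/* Raises a zero-byte request to one byte; other sizes pass through. */
size_t clampAllocationSize(size_t Size) { return Size == 0 ? 1 : Size; }

/* Stand-in for a device allocation: host malloc is used here only so the
   sketch is runnable; it is not the CUDA call named in the commit. */
void *allocateAtLeastOneByte(size_t Size) {
  return malloc(clampAllocationSize(Size));
}
```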
Siddharth Bhat 8ff723dcf1 [NFC] [GPUJIT] Print line number & size information on allocateMemoryForDeviceCuda failure
- It's useful to know the amount of memory asked for since, for example,
  asking for `0` bytes of memory is illegal.

- Line number is helpful since we print the same message in the function
  at different points.

llvm-svn: 310340
2017-08-08 09:03:27 +00:00
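
One way to attach the line number and requested size to a failure message is a macro around `__LINE__`. This is a sketch only: the message wording, the function name, and the macro name are assumptions, not the text GPUJIT prints.

```c
#include <stddef.h>
#include <stdio.h>

/* Writes the diagnostic into Buffer and returns snprintf's result. */
int formatAllocationFailure(char *Buffer, size_t BufferSize, int Line,
                            size_t RequestedBytes) {
  return snprintf(Buffer, BufferSize,
                  "device allocation failed at line %d (requested %zu bytes)",
                  Line, RequestedBytes);
}

/* Hypothetical helper: captures the line number of the call site, so two
   failure paths in the same function produce distinguishable output. */
#define REPORT_ALLOCATION_FAILURE(RequestedBytes)                  \
  do {                                                             \
    char Message[128];                                             \
    formatAllocationFailure(Message, sizeof(Message), __LINE__,    \
                            (RequestedBytes));                     \
    fprintf(stderr, "%s\n", Message);                              \
  } while (0)
```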
Siddharth Bhat f23bb4a8ba [GPUJIT] Add GPUJIT APIs for allocating and freeing managed memory.
We introduce `polly_mallocManaged` and `polly_freeManaged` as
proxies for `cudaMallocManaged` / `cudaFree`. This is currently not
used by Polly. It is auxiliary code that is used in `COSMO`.

This is useful because `polly_mallocManaged` matches the signature of `malloc`,
while `cudaMallocManaged` does not. We introduce `polly_freeManaged` for
symmetry.

We use this in COSMO to use the unified memory feature of the newer
CUDA APIs (>= 6).

Differential Revision: https://reviews.llvm.org/D35991

llvm-svn: 309808
2017-08-02 12:23:22 +00:00
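
The signature mismatch mentioned above can be illustrated with a small adapter: an allocator that reports status and returns the pointer through an argument is wrapped so that callers see a `malloc`-shaped function. All names below are hypothetical, and the inner allocator is a host-memory stand-in, not the CUDA call.

```c
#include <stddef.h>
#include <stdlib.h>

/* Stand-in with an out-parameter shape (status return, pointer via
   argument). It uses host malloc so the sketch runs anywhere. */
int exampleAllocOutParam(void **Out, size_t Size) {
  *Out = malloc(Size);
  return *Out ? 0 : 1;
}

/* malloc-shaped wrapper: returns the pointer, or NULL on failure. */
void *exampleMallocManaged(size_t Size) {
  void *Ptr = NULL;
  if (exampleAllocOutParam(&Ptr, Size) != 0)
    return NULL;
  return Ptr;
}

/* Matching release function, provided for symmetry. */
void exampleFreeManaged(void *Ptr) { free(Ptr); }
```

A wrapper of this shape can be dropped in wherever code already expects `malloc` and `free`.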
Siddharth Bhat b1a52abd87 [GPUJIT] Teach GPUJIT to use a pre-existing CUDA context if available.
On mixing the driver and runtime APIs, it is quite possible that a
context already exists due to runtime API usage. In this case, Polly should
try to use the same context.

This patch teaches GPUJIT to detect that a context exists and how to
pick up this context.

Without this, calling `cudaMallocManaged`, for example, before a
polly-generated kernel launch causes P100 to *hang*.

This is a part of (https://reviews.llvm.org/D35991) that was extracted
out.

Differential Revision: https://reviews.llvm.org/D36162

llvm-svn: 309802
2017-08-02 09:19:42 +00:00
Siddharth Bhat 442e722c1e [GPUJIT] Call `cuProfilerStop` before destroying the context to flush profiler cache.
This is necessary to get accurate traces from `nvprof` / `nvcc`.
Otherwise, we lose some profiling information.

Differential Revision: https://reviews.llvm.org/D35940

llvm-svn: 309682
2017-08-01 14:36:24 +00:00
Philipp Schaad 2f3073b5cb [Polly][GPGPU] Added SPIR Code Generation and Corresponding Runtime Support for Intel
Summary:
Added SPIR Code Generation to the PPCG Code Generator. This can be invoked using
the polly-gpu-arch flag value 'spir32' or 'spir64' for 32 and 64 bit code respectively.
In addition, runtime support has been added to execute said SPIR code on Intel
GPUs, where the system is equipped with Intel's open source driver Beignet (development
version). This requires the cmake flag 'USE_INTEL_OCL' to be turned on, and the polly-gpu-runtime
flag value to be 'libopencl'.
The transformation of LLVM IR to SPIR is currently quite a hack, consisting in part of regex
string transformations.
Has been tested (working) with Polybench 3.2 on an Intel i7-5500U (integrated graphics chip).

Reviewers: bollu, grosser, Meinersbur, singam-sanjay

Reviewed By: grosser, singam-sanjay

Subscribers: pollydev, nemanjai, mgorny, Anastasia, kbarton

Tags: #polly

Differential Revision: https://reviews.llvm.org/D35185

llvm-svn: 308751
2017-07-21 16:11:06 +00:00
Siddharth Bhat 8ac5340a4e [GPUJIT] Disabled gcc's -Wpedantic for use of dlsym
The ISO C standard does not strictly define the behavior of converting
a `void*` pointer to a function pointer, but the POSIX specification of
dlsym does.

The retrieval of function pointers through dlsym in this case
generates an unnecessary amount of warnings for every API function
assignment, bloating the output.

This patch removes GCC's `-Wpedantic` flag for retrieval and assignment
of these functions. This simplifies debugging the output of GPUJIT.

Differential Revision: https://reviews.llvm.org/D33008

llvm-svn: 302638
2017-05-10 11:51:44 +00:00
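
A sketch of the pattern described above: the `-Wpedantic` diagnostic is disabled only around the `dlsym` assignment and restored afterwards. The function and type names are hypothetical; the pragmas are the GCC diagnostic pragmas, which Clang also accepts. Older glibc versions need `-ldl` at link time.

```c
#include <dlfcn.h>
#include <stddef.h>

typedef int (*IntUnaryFnTy)(int);

/* Looks up Name in Handle and returns it as a function pointer, or NULL
   when the symbol is missing. */
IntUnaryFnTy loadIntUnaryFn(void *Handle, const char *Name) {
  IntUnaryFnTy Fn;
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wpedantic"
  /* Object-to-function pointer conversion: flagged by -Wpedantic. */
  Fn = (IntUnaryFnTy)dlsym(Handle, Name);
#pragma GCC diagnostic pop
  return Fn;
}
```

Because a missing symbol yields NULL rather than a link error, the same lookup pattern lets a library build on systems where the target library is absent.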
Tobias Grosser 0f7ce83018 Add noreturn attribute to avoid warnings about missing initialization
Before this change we saw warnings such as:

  tools/GPURuntime/GPUJIT.c:1566:3:
  warning: variable 'DevPtr' is used uninitialized whenever switch default is
  taken [-Wsometimes-uninitialized]
    default:

llvm-svn: 302621
2017-05-10 05:20:56 +00:00
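
A minimal sketch of why the attribute silences the warning: once the error handler is marked `noreturn`, the compiler knows the `default` branch never reaches the use of the variable. The enum, function names, and strings are hypothetical; `__attribute__((noreturn))` is the GCC/Clang spelling.

```c
#include <stdio.h>
#include <stdlib.h>

typedef enum { EXAMPLE_RUNTIME_CUDA, EXAMPLE_RUNTIME_OPENCL } ExampleRuntimeTy;

/* Never returns, so code after a call to it is known to be unreachable. */
__attribute__((noreturn)) void exampleFatal(const char *Message) {
  fprintf(stderr, "%s\n", Message);
  exit(1);
}

const char *exampleRuntimeName(ExampleRuntimeTy Runtime) {
  const char *Name; /* assigned on every path that reaches the return */
  switch (Runtime) {
  case EXAMPLE_RUNTIME_CUDA:
    Name = "cuda";
    break;
  case EXAMPLE_RUNTIME_OPENCL:
    Name = "opencl";
    break;
  default:
    exampleFatal("unknown runtime");
  }
  return Name;
}
```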
Siddharth Bhat a90be207c6 [Polly][PPCGCodeGen] OpenCL now gets kernel argument size from PPCG CodeGen
Summary: PPCGCodeGeneration now attaches the size of the kernel launch parameters at the end of the parameter list. For the existing CUDA Runtime, this gets ignored, but the OpenCL Runtime knows to check for kernel-argument size at the end of the parameter list. (The resulting parameters list is twice as long. This has been accounted for in the corresponding test cases).

Reviewers: grosser, Meinersbur, bollu

Reviewed By: bollu

Subscribers: nemanjai, yaxunl, Anastasia, pollydev, llvm-commits

Tags: #polly

Differential Revision: https://reviews.llvm.org/D32961

llvm-svn: 302515
2017-05-09 10:45:52 +00:00
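
The doubled parameter list can be pictured as follows. This sketch assumes that slot `NumArgs + I` points to an `int` holding the byte size of argument `I`; the summary only says the sizes sit at the end of the list, so the element type and the function names are assumptions.

```c
#include <stddef.h>

/* Reads the size of argument Index from the second half of the list. */
size_t kernelArgumentSize(void **Parameters, unsigned NumArgs,
                          unsigned Index) {
  return (size_t)*(int *)Parameters[NumArgs + Index];
}

/* Sums the sizes of all NumArgs arguments. */
size_t totalKernelArgumentBytes(void **Parameters, unsigned NumArgs) {
  size_t Total = 0;
  for (unsigned I = 0; I < NumArgs; I++)
    Total += kernelArgumentSize(Parameters, NumArgs, I);
  return Total;
}
```

A runtime that does not need the sizes can keep reading only the first `NumArgs` entries, which matches the note that the CUDA path ignores the extra half.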
Siddharth Bhat 0c8dcfd743 [Polly][GPUJIT] Fixed OpenCL 2.0 min requirement for Error codes
Summary: Removed OpenCL error code identifiers introduced in version 2.0.

Reviewers: grosser, bollu

Reviewed By: bollu

Subscribers: yaxunl, Anastasia, pollydev, llvm-commits

Tags: #polly

Differential Revision: https://reviews.llvm.org/D32962

llvm-svn: 302423
2017-05-08 14:10:37 +00:00
Siddharth Bhat 17f01968f1 [Polly] Added OpenCL Runtime to GPURuntime Library for GPGPU CodeGen
Summary:
When compiling for GPU, one can now choose to compile for OpenCL or CUDA,
with the corresponding polly-gpu-runtime flag (libopencl / libcudart). The
GPURuntime library (GPUJIT) has been extended with the OpenCL Runtime library
for that purpose, correctly choosing the corresponding library calls to the
option chosen when compiling (via different initialization calls).

Additionally, a specific GPU Target architecture can now be chosen with -polly-gpu-arch (only nvptx64 implemented thus far).

Reviewers: grosser, bollu, Meinersbur, etherzhhb, singam-sanjay

Reviewed By: grosser, Meinersbur

Subscribers: singam-sanjay, llvm-commits, pollydev, nemanjai, mgorny, yaxunl, Anastasia

Tags: #polly

Differential Revision: https://reviews.llvm.org/D32431

llvm-svn: 302379
2017-05-07 21:03:46 +00:00
Siddharth Bhat 5cf77125fc [Polly] [GPUJIT] Adapted argument capitalization to fit standard
Summary: Function argument naming changed to reflect capitalization standards.

Reviewers: grosser, Meinersbur

Reviewed By: grosser

Differential Revision: https://reviews.llvm.org/D32854

llvm-svn: 302376
2017-05-07 19:53:35 +00:00
Siddharth Bhat 448b8079cc [Polly] [GPUJIT] Moved error prints to stderr
Summary: Errors previously printed to stdout now get printed to stderr.

Reviewers: grosser, Meinersbur

Reviewed By: grosser

Differential Revision: https://reviews.llvm.org/D32852

llvm-svn: 302375
2017-05-07 18:31:25 +00:00
Siddharth Bhat c1267b9baa Revert "[Polly] Added OpenCL Runtime to GPURuntime Library for GPGPU CodeGen"
This reverts commit 17a84e414adb51ee375d14836d4c2a817b191933.

Patches should have been submitted in the order of:

1. D32852
2. D32854
3. D32431

I mistakenly pushed D32431(3) first. Reverting to push in the correct
order.

llvm-svn: 302217
2017-05-05 09:02:08 +00:00
Siddharth Bhat 51904ae35a [Polly] Added OpenCL Runtime to GPURuntime Library for GPGPU CodeGen
Summary:
When compiling for GPU, one can now choose to compile for OpenCL or CUDA,
with the corresponding polly-gpu-runtime flag (libopencl / libcudart). The
GPURuntime library (GPUJIT) has been extended with the OpenCL Runtime library
for that purpose, correctly choosing the corresponding library calls to the
option chosen when compiling (via different initialization calls).

Additionally, a specific GPU Target architecture can now be chosen with -polly-gpu-arch (only nvptx64 implemented thus far).

Reviewers: grosser, bollu, Meinersbur, etherzhhb, singam-sanjay

Reviewed By: grosser, Meinersbur

Subscribers: singam-sanjay, llvm-commits, pollydev, nemanjai, mgorny, yaxunl, Anastasia

Tags: #polly

Differential Revision: https://reviews.llvm.org/D32431

llvm-svn: 302215
2017-05-05 07:54:49 +00:00
Siddharth Bhat abed49699b [Polly] [PPCGCodeGeneration] Add managed memory support to GPU code
generation.

This needs changes to GPURuntime to expose synchronization between host
and device.

1. Needs better function naming, I want a better name than
"getOrCreateManagedDeviceArray"

2. DeviceAllocations is used by both the managed memory and the
non-managed memory path. This exploits the fact that the two code paths
are never run together. I'm not sure if this is the best design decision

Reviewed by: PhilippSchaad

Tags: #polly

Differential Revision: https://reviews.llvm.org/D32215

llvm-svn: 301640
2017-04-28 11:16:30 +00:00
Tobias Grosser b187515784 GPGPU: Cache PTX kernels
We always keep a number of already compiled kernels available to avoid
costly recompilation.

llvm-svn: 277707
2016-08-04 09:15:58 +00:00
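
A small fixed-size cache of compiled kernels might look like the sketch below. The capacity, the least-recently-used eviction policy, and all names are assumptions; the commit message only says that a number of compiled kernels is kept. An integer handle and a counter stand in for the real compiled kernel and the compilation step.

```c
#include <stddef.h>
#include <string.h>

#define KERNEL_CACHE_SIZE 2 /* hypothetical capacity */

typedef struct {
  const char *Source; /* kernel text used as the cache key */
  int Handle;         /* stand-in for a compiled kernel object */
  unsigned long Tick; /* last-use stamp; 0 marks an empty slot */
} CacheEntryTy;

static CacheEntryTy Cache[KERNEL_CACHE_SIZE];
static unsigned long Clock;
int CompileCount; /* number of simulated compilations */

/* Stand-in for the costly compilation step. */
static int compileKernel(const char *Source) {
  (void)Source;
  return ++CompileCount;
}

/* Returns the cached handle for Source, compiling only on a miss. The
   caller must keep Source alive while it is cached. */
int getKernel(const char *Source) {
  size_t Victim = 0;
  for (size_t I = 0; I < KERNEL_CACHE_SIZE; I++) {
    if (Cache[I].Source && strcmp(Cache[I].Source, Source) == 0) {
      Cache[I].Tick = ++Clock;
      return Cache[I].Handle;
    }
    if (Cache[I].Tick < Cache[Victim].Tick)
      Victim = I; /* remember the least recently used slot */
  }
  Cache[Victim].Source = Source;
  Cache[Victim].Handle = compileKernel(Source);
  Cache[Victim].Tick = ++Clock;
  return Cache[Victim].Handle;
}
```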
Tobias Grosser 79a947c233 GPGPU: Add basic support for kernel launches
llvm-svn: 276863
2016-07-27 13:20:16 +00:00
Tobias Grosser 5779359624 GPGPU: Load GPU kernels
We embed the PTX code into the host IR as a global variable and compile it
at run-time into a GPU kernel.

llvm-svn: 276645
2016-07-25 16:31:21 +00:00
Tobias Grosser 13c78e4d51 GPGPU: Emit data-transfer code
Also factor out getArraySize() to avoid code duplication and reorder some
function arguments to indicate the direction into which data is transferred.

llvm-svn: 276636
2016-07-25 12:47:39 +00:00
Tobias Grosser 7287aeddf1 GPGPU: Complete code to allocate and free device arrays
At the beginning of each SCoP, we allocate device arrays for all arrays
used on the GPU and we free such arrays after the SCoP has been executed.

llvm-svn: 276635
2016-07-25 12:47:33 +00:00
Tobias Grosser 19b8a0bbfb GPURuntime: Add missing debug output
llvm-svn: 276634
2016-07-25 12:47:28 +00:00
Tobias Grosser a71eedd4c5 GPURuntime: Drop polly_cleanupGPGPUResources
This function is currently unused and won't be used in this form again. Instead
of freeing many unrelated items at the same time, we will explicitly call
free functions from the host-IR we generate for each object we want to free.
These specific free functions will be added together with the corresponding
host-IR generation code.

llvm-svn: 276632
2016-07-25 12:47:22 +00:00
Tobias Grosser fa7b080218 GPGPU: initialize GPU context and simplify the corresponding GPURuntime interface.
There is no need to expose the selected device at the moment. We also pass back
pointers as return values, as this simplifies the interface.

llvm-svn: 276623
2016-07-25 09:16:01 +00:00
Tobias Grosser 0a1a2720c8 GPURuntime: Check for debug-mode early on
Before this change, the debug statements in polly_initDevice would all be
skipped, as debug-mode would only be enabled _after_ they have already been run.

llvm-svn: 276621
2016-07-25 09:15:53 +00:00
Tobias Grosser dc816da455 GPURuntime: Drop timing functionality (some leftover II)
llvm-svn: 276617
2016-07-25 08:03:08 +00:00
Tobias Grosser 92713bea42 GPURuntime: Drop timing functionality
This functionality won't be used in the current iteration. Drop it for now to
reduce the surface of the library. We can always add it back in when we need
it again.

llvm-svn: 276611
2016-07-25 07:10:45 +00:00
Tobias Grosser 91990ab3ac GPURuntime: Only print status in debug mode
This change moves all status messages that are printed in non-error mode behind
the POLLY_DEBUG flag.

llvm-svn: 274598
2016-07-06 03:04:53 +00:00
Tobias Grosser 856e31bb9c GPURuntime: Drop polly_allocateMemoryForHostAndDevice
This function is currently unused and will be replaced in the future by
functions that allow allocating memory only on the host or only on the device.

llvm-svn: 274597
2016-07-06 03:04:50 +00:00
Tobias Grosser a24d3ba26a GPURuntime: Add basic debug tracing infrastructure
When the POLLY_DEBUG environment variable is set, each call into the run-time
library prints the name of the called function to stderr.

llvm-svn: 274596
2016-07-06 03:04:47 +00:00
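
The tracing scheme can be sketched with a flag that is read from `POLLY_DEBUG` once and a macro that prints `__func__`. Only the environment variable name comes from the commit message; the function names, the macro name, and the output format are assumptions.

```c
#include <stdio.h>
#include <stdlib.h>

static int DebugMode;

/* Reads the flag from the environment. Call this before any traced code
   runs, otherwise early trace statements are silently skipped. */
void exampleInitDebugMode(void) { DebugMode = getenv("POLLY_DEBUG") != NULL; }

int exampleDebugModeEnabled(void) { return DebugMode; }

/* Prints the enclosing function's name to stderr when tracing is on. */
#define EXAMPLE_TRACE()                         \
  do {                                          \
    if (DebugMode)                              \
      fprintf(stderr, "-> %s\n", __func__);     \
  } while (0)

/* Hypothetical run-time entry point that announces itself when traced. */
int exampleRuntimeCall(int Value) {
  EXAMPLE_TRACE();
  return Value + 1;
}
```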
Tobias Grosser 114180db5b Also clang-format *.c run-time library files
llvm-svn: 262917
2016-03-08 07:34:58 +00:00
Tobias Grosser 1346663551 Fix formatting issues in banner
llvm-svn: 235867
2015-04-27 12:02:36 +00:00
Tobias Grosser 4d96c8d714 clang-format: Many more files
After this commit, polly is clang-format clean. This can be tested with
'ninja polly-check-format'. Updates to clang-format may change this, but the
differences will hopefully be both small and general improvements to the
formatting.

We currently have some not very nice formatting for a couple of items, DEBUG()
stmts for example. I believe the benefit of being clang-format clean outweighs
the imperfect layout of this code.

llvm-svn: 177796
2013-03-23 01:05:07 +00:00
Tobias Grosser 903c242662 Update libGPURuntime to be dual licensed under MIT and UIUC license.
Contributed by: Yabin Hu <yabin.hwu@gmail.com>

llvm-svn: 159815
2012-07-06 10:40:15 +00:00
Tobias Grosser 5c0f6f3350 Replace CUDA data types with Polly's GPGPU data types.
Contributed by: Yabin Hu <yabin.hwu@gmail.com>

llvm-svn: 159725
2012-07-04 21:45:03 +00:00
Tobias Grosser fb4842ff95 Add the runtime library for GPGPU code generation.
Contributed by: Yabin Hu <yabin.hwu@gmail.com>

llvm-svn: 158304
2012-06-11 09:25:01 +00:00