In case, the current team is a serialized team (lwt), the frame information should be written to this data structure.
Before, nested serialized teams would overwrite the same task information.
Patch by Joachim Protze!
Differential Revision: https://reviews.llvm.org/D23310
llvm-svn: 281467
The comment already states, that this function should work similarly as __ompt_get_taskinfo.
The function only looked for lwt entries of the current team, but not when unrolling the parents. This fix aligns the implementation to __ompt_get_taskinfo.
The new test case creates a single theaded team (->lwt) and then a nested active team.
Before the innermost print_id(1) would deliver a different team then the outer print_id(0).
Patch by Joachim Protze!
Differential Revision: https://reviews.llvm.org/D23309
llvm-svn: 281466
The exit address is set when execution of a task is started and should be reset as soon as the execution is finished.
Especially for the asm implementation of __kmp_invoke_microtask, resetting in this call would be painfull, so reset just after the invokation.
The testcase shows the effect of this patch:
Before, the implicit barriers at the end of an implicit task would see an exit address for the implicit task.
This barrier is a task scheduling point. Thus, any explicit task scheduled there would see an exit, but no reenter address for the implicit task.
Patch by Joachim Protze!
Differential Revision: https://reviews.llvm.org/D23307
llvm-svn: 281465
The latest OMPT spec changed the semantic of a tasks reenter frame to be the application frame, that will be entered, when the runtime frame drops.
Before it was the last frame in the runtime. This doesn't work for some gcc execution pathes or even clang generated code for :
Since there is no runtime frame between the executed task and the encountering task.
The test case compares exit and reenter addresses against addresses captured in application code
Patch by Joachim Protze!
Differential Revision: https://reviews.llvm.org/D23305
llvm-svn: 281464
OMPT tests can check for right frame information of tasks:
* parent_task_frame was directly printed as a pointer, but actually points to a struct ompt_frame {void*, void*}
* NULL is printed in the beginning of execution and loaded to FileChecker variable [[NULL]]
* implicit tasks now also print their frame information
* macro to print frame address from application
* print task info for barrier begin
Patch by Joachim Protze!
Differential Revision: https://reviews.llvm.org/D23304
llvm-svn: 281463
Support overriding LLVM_* variables obtained from llvm-config when doing
stand-alone builds. The override of LLVM_MAIN_SRC_DIR is necessary to
provide LLVM sources when the initial directory used to build LLVM does
no longer exist when compiler-rt is built stand-alone. This is
especially the case when building the projects separately in temporary
directories with unpredictable names.
The code is based on existing CMakeLists.txt from clang. Alike clang, it
extends the override to all queried variables.
Differential Revision: https://reviews.llvm.org/D24005
llvm-svn: 281461
Summary:
This is needed for the recently submitted misc-use-after-move check (rL281453).
For some reason, this still built under Linux, but it caused the PPC build bot
to fail.
Subscribers: beanz, cfe-commits, mgorny
Differential Revision: https://reviews.llvm.org/D24561
llvm-svn: 281460
--section-start=sectionname=org
Locate a section in the output file at the absolute address given by org.
You may use this option as many times as necessary to locate multiple sections in the command line.
org must be a single hexadecimal integer; for compatibility with other linkers,
you may omit the leading `0x' usually associated with hexadecimal values.
Note: there should be no white space between sectionname, the equals sign (“<=>”), and org.
-Tbss=org
-Tdata=org
-Ttext=org
Same as --section-start, with .bss, .data or .text as the sectionname.
Differential revision: https://reviews.llvm.org/D24294
llvm-svn: 281458
Summary:
Extend `tooling::Replacements::add()` to support adding order-independent replacements.
Two replacements are considered order-independent if one of the following conditions is true:
- They do not overlap. (This is already supported.)
- One replacement is insertion, and the other is a replacement with
length > 0, and the insertion is adjecent to but not contained in the
other replacement. In this case, the replacement should always change
the original code instead of the inserted text.
Reviewers: klimek, djasper
Subscribers: cfe-commits, klimek
Differential Revision: https://reviews.llvm.org/D24515
llvm-svn: 281457
Having both rename-at and rename-all both seems confusing and introduces
unneeded difficulties. Allowing to use both -qualified-name and -offset at once
while performing efficient renamings seems like a feature, too. Maintaining main
function wrappers and custom help becomes redundant while CLI becomes less
confusing.
Reviewers: alexfh
Differential Revision: https://reviews.llvm.org/D24224
llvm-svn: 281456
Summary:
The check warns if an object is used after it has been moved, without an
intervening reinitialization.
See user-facing documentation for details.
Reviewers: sbenza, Prazek, alexfh
Subscribers: beanz, mgorny, shadeware, omtcyfz, Eugene.Zelenko, Prazek, fowles, ioeric, cfe-commits
Differential Revision: https://reviews.llvm.org/D23353
llvm-svn: 281453
This is the fourth patch to apply the BLIS matmul optimization pattern on matmul
kernels (http://www.cs.utexas.edu/users/flame/pubs/TOMS-BLIS-Analytical.pdf).
BLIS implements gemm as three nested loops around a macro-kernel, plus two
packing routines. The macro-kernel is implemented in terms of two additional
loops around a micro-kernel. The micro-kernel is a loop around a rank-1
(i.e., outer product) update. In this change we perform copying to created
arrays, which is the last step to implement the packing transformation.
Reviewed-by: Tobias Grosser <tobias@grosser.es>
Differential Revision: https://reviews.llvm.org/D23260
llvm-svn: 281441
This patch adds an entry for "-rtlib" in the output of `man clang` and `clang -help`.
Patch by Lei Zhang!
Differential Revision: https://reviews.llvm.org/D24069
llvm-svn: 281440
value is a pointer.
This patch is to fix PR30213. When expanding an expr based on ValueOffsetPair,
if the value is of pointer type, we can only create a getelementptr instead
of sub expr.
Differential Revision: https://reviews.llvm.org/D24088
llvm-svn: 281439
This change ensures all necessary symbols are resolved correctly. Before this
change on some systems, the linker may have eliminated some symbols not directly
used in bugpoint, but used in Polly.
Suggested-by: Michael Kruse <lvm@meinersbur.de>
llvm-svn: 281438
Previously, all input files were owned by the symbol table.
Files were created at various places, such as the Driver, the lazy
symbols, or the bitcode compiler, and the ownership of new files
was transferred to the symbol table using std::unique_ptr.
All input files were then free'd when the symbol table is freed
which is on program exit.
I think we don't have to transfer ownership just to free all
instance at once on exit.
In this patch, all instances are automatically collected to a
vector and freed on exit. In this way, we no longer have to
use std::unique_ptr.
Differential Revision: https://reviews.llvm.org/D24493
llvm-svn: 281425
Summary:
We were packing global device memory handles in
`PackedKernelArgumentArray`, but as I was implementing the CUDA
platform, I realized that CUDA wants the address of the handle, not the
handle itself. So this patch switches to packing the address of the
handle.
Reviewers: jlebar
Subscribers: jprice, jlebar, parallel_libs-commits
Differential Revision: https://reviews.llvm.org/D24528
llvm-svn: 281424
Summary:
Platforms were returning Device pointers, but a Device is now basically
just a pointer to an underlying PlatformDevice, so we will now just pass
it around as a value.
Reviewers: jlebar
Subscribers: jprice, jlebar, parallel_libs-commits
Differential Revision: https://reviews.llvm.org/D24537
llvm-svn: 281422
VS 2015 and higher begin making use of c++14 in their standard
library headers. As such, -std=c++11 makes it so you can't compile
trivial programs. Bump this to -std=c++14 when this situation is
detected.
llvm-svn: 281420
ObjC library call with call return.
ARC contraction tries to replace uses of an argument passed to an
objective-c library call with the call return value. For example, in the
following IR, it replaces uses of argument %9 and uses of the values
discovered traversing the chain upwards (%7 and %8) with the call return
%10, if they are dominated by the call to @objc_autoreleaseReturnValue.
This transformation enables code-gen to tail-call the call to
@objc_autoreleaseReturnValue, which is necessary to enable auto release
return value optimization.
%7 = tail call i8* @objc_loadWeakRetained(i8** %6)
%8 = bitcast i8* %7 to %0*
%9 = bitcast %0* %8 to i8*
%10 = tail call i8* @objc_autoreleaseReturnValue(i8* %9)
ret %0* %8
Since r276727, llvm started removing redundant bitcasts and as a result
started feeding the following IR to ARC contraction:
%7 = tail call i8* @objc_loadWeakRetained(i8** %6)
%8 = bitcast i8* %7 to %0*
%9 = tail call i8* @objc_autoreleaseReturnValue(i8* %7)
ret %0* %8
ARC contraction no longer does the optimization described above since it
only traverses the chain upwards and fails to recognize that the
function return can be replaced by the call return. This commit changes
ARC contraction to traverse the chain downwards too and replace uses of
bitcasts with the call return.
rdar://problem/28011339
Differential Revision: https://reviews.llvm.org/D24523
llvm-svn: 281419
using to enqueue all the jobs wasn't enough time on a slow/overloaded
system. Instead use a global to indicate when all the work has
been enqueued, let's see if this makes the CIs work more reliably.
llvm-svn: 281418