Turning on `enable_noundef_analysis` flag allows better codegen by removing freeze instructions.
I modified clang by renaming `enable_noundef_analysis` flag to `disable-noundef-analysis` and turning it off by default.
Test updates are made as a separate patch: D108453
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D105169
Turning on `enable_noundef_analysis` flag allows better codegen by removing freeze instructions.
I modified clang by renaming `enable_noundef_analysis` flag to `disable-noundef-analysis` and turning it off by default.
Test updates are made as a separate patch: D108453
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D105169
[Clang/Test]: Rename enable_noundef_analysis to disable-noundef-analysis and turn it off by default (2)
This patch updates test files after D105169.
Autogenerated test codes are changed by `utils/update_cc_test_checks.py,` and non-autogenerated test codes are changed as follows:
(1) I wrote a python script that (partially) updates the tests using regex: {F18594904} The script is not perfect, but I believe it gives hints about which patterns are updated to have `noundef` attached.
(2) The remaining tests are updated manually.
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D108453
Resolve lit failures in clang after 8ca4b3e's land
Fix lit test failures in clang-ppc* and clang-x64-windows-msvc
Fix missing failures in clang-ppc64be* and retry fixing clang-x64-windows-msvc
Fix internal_clone(aarch64) inline assembly
Turning on `enable_noundef_analysis` flag allows better codegen by removing freeze instructions.
I modified clang by renaming `enable_noundef_analysis` flag to `disable-noundef-analysis` and turning it off by default.
Test updates are made as a separate patch: D108453
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D105169
This patch updates test files after D105169.
Autogenerated test codes are changed by `utils/update_cc_test_checks.py,` and non-autogenerated test codes are changed as follows:
(1) I wrote a python script that (partially) updates the tests using regex: {F18594904} The script is not perfect, but I believe it gives hints about which patterns are updated to have `noundef` attached.
(2) The remaining tests are updated manually.
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D108453
Parallel regions are outlined as functions with capture variables explicitly generated as distinct parameters in the function's argument list. That complicates the fork_call interface in the OpenMP runtime: (1) the fork_call is variadic since there is a variable number of arguments to forward to the outlined function, (2) wrapping/unwrapping arguments happens in the OpenMP runtime, which is sub-optimal, has been a source of ABI bugs, and has a hardcoded limit (16) in the number of arguments, (3) forwarded arguments must cast to pointer types, which complicates debugging. This patch avoids those issues by aggregating captured arguments in a struct to pass to the fork_call.
Reviewed By: jdoerfert, jhuber6
Differential Revision: https://reviews.llvm.org/D102107
Parallel regions are outlined as functions with capture variables explicitly generated as distinct parameters in the function's argument list. That complicates the fork_call interface in the OpenMP runtime: (1) the fork_call is variadic since there is a variable number of arguments to forward to the outlined function, (2) wrapping/unwrapping arguments happens in the OpenMP runtime, which is sub-optimal, has been a source of ABI bugs, and has a hardcoded limit (16) in the number of arguments, (3) forwarded arguments must cast to pointer types, which complicates debugging. This patch avoids those issues by aggregating captured arguments in a struct to pass to the fork_call.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D102107
This patch adds a module level metadata flag indicating that the module
was compiled with the `-fopenmp` flag. This will make it easier for
passes like OpenMPOpt to determine if it should be run.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D102361
Refactored code of dependence processing and added new inoutset dependence type.
Compiler can set dependence flag to 0x8 when call __kmpc_omp_task_with_deps.
Size of type of the dependence flag changed from 1 to 4 bytes in clang.
All dependence flags library gets so far and corresponding dependence types:
1 - IN, 2 - OUT, 3 - INOUT, 4 - MUTEXINOUTSET, 8 - INOUTSET.
Differential Revision: https://reviews.llvm.org/D97085
If a test does not contain an " simd" but -fopenmp-simd RUN lines we can
just check that we do not create __kmpc|__tgt calls.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101973
This patch refactors a subset of Clang OpenMP tests, generating checklines using the update_cc_test_checks script. This refactoring facilitates updating the Clang OpenMP code generation codebase by automating test generation.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D101849
This patch refactors a subset of Clang OpenMP tests, generating checklines using the update_cc_test_checks script. This refactoring facilitates updating the Clang OpenMP code generation codebase by automating test generation.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D101849
For a definition (of most linkage types), dso_local is set for ELF -fno-pic/-fpie
and COFF, but not for Mach-O. This nuance causes unneeded binary format differences.
This patch replaces (function) `define ` with `define{{.*}} `,
(variable/constant/alias) `= ` with `={{.*}} `, or inserts appropriate `{{.*}} `
if there is an explicit linkage.
* Clang will set dso_local for Mach-O, which is currently implied by TargetMachine.cpp. This will make COFF/Mach-O and executable ELF similar.
* Eventually I hope we can make dso_local the textual LLVM IR default (write explicit "dso_preemptable" when applicable) and -fpic ELF will be similar to everything else. This patch helps move toward that goal.
Many OpenMP Clang tests do not RUN for version 4.5 and the default
version. This first patch in the series only handles test cases
which do not require any modifications in the CHECK lines after
adding RUN lines for default version.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D84844
only.
Added support for -fopenmp-simd option that allows compilation of
simd-based constructs without emission of OpenMP runtime calls.
llvm-svn: 321560
Directive name modifiers in 'if' clause are allowed only for OpenMP 4.5
and higher + in OpenMP 4.5 parsing procedure emits error message if ':'
is not found after directive name modifier.
llvm-svn: 290175
directives.
'kmp_task_t' record type added a new field for 'priority' clause and
changed the representation of pointer to destructors for privates used
within loop-based directives.
Old representation:
typedef struct kmp_task { /* GEH: Shouldn't this be
aligned somehow? */
void *shareds; /**< pointer to block of
pointers to shared vars */
kmp_routine_entry_t routine; /**< pointer to routine
to call for executing task */
kmp_int32 part_id; /**< part id for the
task */
kmp_routine_entry_t destructors; /* pointer to function to
invoke deconstructors of firstprivate C++ objects */
/* private vars */
} kmp_task_t;
New representation:
typedef struct kmp_task { /* GEH: Shouldn't this be
aligned somehow? */
void *shareds; /**< pointer to block of
pointers to shared vars */
kmp_routine_entry_t routine; /**< pointer to routine
to call for executing task */
kmp_int32 part_id; /**< part id for the
task */
kmp_cmplrdata_t data1; /* Two known
optional additions: destructors and priority */
kmp_cmplrdata_t data2; /* Process
destructors first, priority second */
/* future data */
/* private vars */
} kmp_task_t;
Also excessive initialization of 'destructors' fields to 'null' was
removed from codegen if it is known that no destructors shal be used.
Currently a special bit is used in 'kmp_tasking_flags_t' bitfields
('destructors_thunk' bitfield).
llvm-svn: 271201
Currently all variables used in OpenMP regions are captured into a record and passed to outlined functions in this record. It may result in some poor performance because of too complex analysis later in optimization passes. Patch makes to emit outlined functions for parallel-based regions with a list of captured variables. It reduces code for 2*n GEPs, stores and loads at least.
Codegen for task-based regions remains unchanged because runtime requires that all captured variables are passed in captured record.
llvm-svn: 247251
Fixed codegen for extended format of 'if' clauses with special 'directive-name-modifier' + ast-print tests for extended format of 'if' clause.
llvm-svn: 246748
If task directive has associated 'depend' clause then function kmp_int32 __kmpc_omp_task_with_deps ( ident_t *loc_ref, kmp_int32 gtid, kmp_task_t * new_task, kmp_int32 ndeps, kmp_depend_info_t *dep_list,kmp_int32 ndeps_noalias, kmp_depend_info_t *noalias_dep_list) must be called instead of __kmpc_omp_task().
If this directive has associated 'if' clause then also before a call of kmpc_omp_task_begin_if0() a function void __kmpc_omp_wait_deps ( ident_t *loc_ref, kmp_int32 gtid, kmp_int32 ndeps, kmp_depend_info_t *dep_list, kmp_int32 ndeps_noalias, kmp_depend_info_t *noalias_dep_list) must be called.
Array sections are not supported yet.
llvm-svn: 240532
-fopenmp turns on OpenMP support and links libiomp5 as OpenMP library. Also there is -fopenmp={libiomp5|libgomp} option that allows to override effect of -fopenmp and link libgomp library (if -fopenmp=libgomp is specified).
Differential Revision: http://reviews.llvm.org/D9736
llvm-svn: 237769
If condition evaluates to true, the code executes task by calling @__kmpc_omp_task() runtime function.
If condition evaluates to false, the code executes serial version of the code by executing the following code:
call void @__kmpc_omp_task_begin_if0(<loc>, <threadid>, <task_t_ptr, returned by @__kmpc_omp_task_alloc()>);
proxy_task_entry(<gtid>, <task_t_ptr, returned by @__kmpc_omp_task_alloc()>);
call void @__kmpc_omp_task_complete_if0(<loc>, <threadid>, <task_t_ptr, returned by @__kmpc_omp_task_alloc()>);
Also it checks if the condition is constant and if it is constant it evaluates its value and then generates either parallel version of the code (if the condition evaluates to true), or the serial version of the code (if the condition evaluates to false).
Differential Revision: http://reviews.llvm.org/D9143
llvm-svn: 235507