Update Kokkos library to r2.6.00
parent 0c4c002f34
commit 39786b1740
@@ -1,5 +1,49 @@
 # Change Log
 
+## [2.6.00](https://github.com/kokkos/kokkos/tree/2.6.00) (2018-03-07)
+[Full Changelog](https://github.com/kokkos/kokkos/compare/2.5.00...2.6.00)
+
+**Part of the Kokkos C++ Performance Portability Programming EcoSystem 2.6**
+
+**Implemented enhancements:**
+
+- Support NVIDIA Volta microarchitecture [\#1466](https://github.com/kokkos/kokkos/issues/1466)
+- Kokkos - Define empty functions when profiling disabled [\#1424](https://github.com/kokkos/kokkos/issues/1424)
+- Don't use \_\_constant\_\_ cache for lock arrays, enable once per run update instead of once per call [\#1385](https://github.com/kokkos/kokkos/issues/1385)
+- task dag enhancement. [\#1354](https://github.com/kokkos/kokkos/issues/1354)
+- Cuda task team collectives and stack size [\#1353](https://github.com/kokkos/kokkos/issues/1353)
+- Replace View operator acceptance of more than rank integers with 'access' function [\#1333](https://github.com/kokkos/kokkos/issues/1333)
+- Interoperability: Do not shut down backend execution space runtimes upon calling finalize. [\#1305](https://github.com/kokkos/kokkos/issues/1305)
+- shmem\_size for LayoutStride [\#1291](https://github.com/kokkos/kokkos/issues/1291)
+- Kokkos::resize performs poorly on 1D Views [\#1270](https://github.com/kokkos/kokkos/issues/1270)
+- stride\(\) is inconsistent with dimension\(\), extent\(\), etc. [\#1214](https://github.com/kokkos/kokkos/issues/1214)
+- Kokkos::sort defaults to std::sort on host [\#1208](https://github.com/kokkos/kokkos/issues/1208)
+- DynamicView with host size grow [\#1206](https://github.com/kokkos/kokkos/issues/1206)
+- Unmanaged View with Anonymous Memory Space [\#1175](https://github.com/kokkos/kokkos/issues/1175)
+- Sort subset of Kokkos::DynamicView [\#1160](https://github.com/kokkos/kokkos/issues/1160)
+- MDRange policy doesn't support lambda reductions [\#1054](https://github.com/kokkos/kokkos/issues/1054)
+- Add ability to set hook on Kokkos::finalize [\#714](https://github.com/kokkos/kokkos/issues/714)
+- Atomics with Serial Backend - Default should be Disable? [\#549](https://github.com/kokkos/kokkos/issues/549)
+- KOKKOS\_ENABLE\_DEPRECATED\_CODE [\#1359](https://github.com/kokkos/kokkos/issues/1359)
+
+**Fixed bugs:**
+
+- cuda\_internal\_maximum\_warp\_count returns 8, but I believe it should return 16 for P100 [\#1269](https://github.com/kokkos/kokkos/issues/1269)
+- Cuda: level 1 scratch memory bug \(reported by Stan Moore\) [\#1434](https://github.com/kokkos/kokkos/issues/1434)
+- MDRangePolicy Reduction requires value\_type typedef in Functor [\#1379](https://github.com/kokkos/kokkos/issues/1379)
+- Kokkos DeepCopy between empty views fails [\#1369](https://github.com/kokkos/kokkos/issues/1369)
+- Several issues with new CMake build infrastructure \(reported by Eric Phipps\) [\#1365](https://github.com/kokkos/kokkos/issues/1365)
+- deep\_copy between rank-1 host/device views of differing layouts without UVM no longer works \(reported by Eric Phipps\) [\#1363](https://github.com/kokkos/kokkos/issues/1363)
+- Profiling can't be disabled in CMake, and a parallel\_for is missing for tasks \(reported by Kyungjoo Kim\) [\#1349](https://github.com/kokkos/kokkos/issues/1349)
+- get\_work\_partition int overflow \(reported by berryj5\) [\#1327](https://github.com/kokkos/kokkos/issues/1327)
+- Kokkos::deep\_copy must fence even if the two views are the same [\#1303](https://github.com/kokkos/kokkos/issues/1303)
+- CudaUVMSpace::allocate/deallocate must fence [\#1302](https://github.com/kokkos/kokkos/issues/1302)
+- ViewResize on CUDA fails in Debug because of too many resources requested [\#1299](https://github.com/kokkos/kokkos/issues/1299)
+- Cuda 9 and intrepid2 calls from Panzer. [\#1183](https://github.com/kokkos/kokkos/issues/1183)
+- Slowdown due to tracking\_enabled\(\) in 2.04.00 \(found by Albany app\) [\#1016](https://github.com/kokkos/kokkos/issues/1016)
+- Bounds checking fails with zero-span Views \(reported by Stan Moore\) [\#1411](https://github.com/kokkos/kokkos/issues/1411)
+
+
 ## [2.5.00](https://github.com/kokkos/kokkos/tree/2.5.00) (2017-12-15)
 [Full Changelog](https://github.com/kokkos/kokkos/compare/2.04.11...2.5.00)
 

@@ -7,7 +7,7 @@ ELSE()
 ENDIF()
 
 IF(NOT KOKKOS_HAS_TRILINOS)
-cmake_minimum_required(VERSION 3.1 FATAL_ERROR)
+cmake_minimum_required(VERSION 3.3 FATAL_ERROR)
 
 # Define Project Name if this is a standalone build
 IF(NOT DEFINED ${PROJECT_NAME})
@@ -37,9 +37,19 @@ IF(NOT KOKKOS_HAS_TRILINOS)
 COMMAND ${KOKKOS_SETTINGS} make -f ${KOKKOS_SRC_PATH}/cmake/Makefile.generate_cmake_settings CXX=${CMAKE_CXX_COMPILER} generate_build_settings
 WORKING_DIRECTORY "${Kokkos_BINARY_DIR}"
 OUTPUT_FILE ${Kokkos_BINARY_DIR}/core_src_make.out
-RESULT_VARIABLE res
+RESULT_VARIABLE GEN_SETTINGS_RESULT
 )
+if (GEN_SETTINGS_RESULT)
+message(FATAL_ERROR "Kokkos settings generation failed:\n"
+"${KOKKOS_SETTINGS} make -f ${KOKKOS_SRC_PATH}/cmake/Makefile.generate_cmake_settings CXX=${CMAKE_CXX_COMPILER} generate_build_settings")
+endif()
 include(${Kokkos_BINARY_DIR}/kokkos_generated_settings.cmake)
+string(REPLACE " " ";" KOKKOS_TPL_INCLUDE_DIRS "${KOKKOS_GMAKE_TPL_INCLUDE_DIRS}")
+string(REPLACE " " ";" KOKKOS_TPL_LIBRARY_DIRS "${KOKKOS_GMAKE_TPL_LIBRARY_DIRS}")
+string(REPLACE " " ";" KOKKOS_TPL_LIBRARY_NAMES "${KOKKOS_GMAKE_TPL_LIBRARY_NAMES}")
+list(REMOVE_ITEM KOKKOS_TPL_INCLUDE_DIRS "")
+list(REMOVE_ITEM KOKKOS_TPL_LIBRARY_DIRS "")
+list(REMOVE_ITEM KOKKOS_TPL_LIBRARY_NAMES "")
 set_kokkos_srcs(KOKKOS_SRC ${KOKKOS_SRC})
 
 #------------ NOW BUILD ------------------------------------------------------

@@ -34,7 +34,7 @@
 // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
 // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 //
-// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
+// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
 //
 // ************************************************************************
 //@HEADER

@@ -19,7 +19,7 @@ snapshot Kokkos from github.com/kokkos to Trilinos.
 
 3) Snapshot the current commit in the Kokkos clone into the Trilinos clone.
 This overwrites ${TRILINOS}/packages/kokkos with the content of ${KOKKOS}:
-${KOKKOS}/config/snapshot.py --verbose ${KOKKOS} ${TRILINOS}/packages
+${KOKKOS}/scripts/snapshot.py --verbose ${KOKKOS} ${TRILINOS}/packages
 
 4) Verify the snapshot commit happened as expected
 cd ${TRILINOS}/packages/kokkos

@@ -36,7 +36,7 @@
 // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
 // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 //
-// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
+// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
 //
 // ************************************************************************
 //@HEADER

@@ -9,8 +9,8 @@ KOKKOS_DEVICES ?= "OpenMP"
 #KOKKOS_DEVICES ?= "Pthreads"
 # Options:
 # Intel: KNC,KNL,SNB,HSW,BDW,SKX
-# NVIDIA: Kepler,Kepler30,Kepler32,Kepler35,Kepler37,Maxwell,Maxwell50,Maxwell52,Maxwell53,Pascal60,Pascal61
-# ARM: ARMv80,ARMv81,ARMv8-ThunderX
+# NVIDIA: Kepler,Kepler30,Kepler32,Kepler35,Kepler37,Maxwell,Maxwell50,Maxwell52,Maxwell53,Pascal60,Pascal61,Volta70,Volta72
+# ARM: ARMv80,ARMv81,ARMv8-ThunderX,ARMv8-TX2
 # IBM: BGQ,Power7,Power8,Power9
 # AMD-GPUS: Kaveri,Carrizo,Fiji,Vega
 # AMD-CPUS: AMDAVX,Ryzen,Epyc
@@ -21,7 +21,7 @@ KOKKOS_DEBUG ?= "no"
 KOKKOS_USE_TPLS ?= ""
 # Options: c++11,c++1z
 KOKKOS_CXX_STANDARD ?= "c++11"
-# Options: aggressive_vectorization,disable_profiling
+# Options: aggressive_vectorization,disable_profiling,disable_deprecated_code
 KOKKOS_OPTIONS ?= ""
 
 # Default settings specific options.
@@ -48,6 +48,7 @@ KOKKOS_INTERNAL_USE_MEMKIND := $(call kokkos_has_string,$(KOKKOS_USE_TPLS),exper
 KOKKOS_INTERNAL_ENABLE_COMPILER_WARNINGS := $(call kokkos_has_string,$(KOKKOS_OPTIONS),compiler_warnings)
 KOKKOS_INTERNAL_OPT_RANGE_AGGRESSIVE_VECTORIZATION := $(call kokkos_has_string,$(KOKKOS_OPTIONS),aggressive_vectorization)
 KOKKOS_INTERNAL_DISABLE_PROFILING := $(call kokkos_has_string,$(KOKKOS_OPTIONS),disable_profiling)
+KOKKOS_INTERNAL_DISABLE_DEPRECATED_CODE := $(call kokkos_has_string,$(KOKKOS_OPTIONS),disable_deprecated_code)
 KOKKOS_INTERNAL_DISABLE_DUALVIEW_MODIFY_CHECK := $(call kokkos_has_string,$(KOKKOS_OPTIONS),disable_dualview_modify_check)
 KOKKOS_INTERNAL_ENABLE_PROFILING_LOAD_PRINT := $(call kokkos_has_string,$(KOKKOS_OPTIONS),enable_profile_load_print)
 KOKKOS_INTERNAL_CUDA_USE_LDG := $(call kokkos_has_string,$(KOKKOS_CUDA_OPTIONS),use_ldg)
@@ -93,7 +94,7 @@ KOKKOS_INTERNAL_COMPILER_INTEL := $(call kokkos_has_string,$(KOKKOS_CXX_VE
 KOKKOS_INTERNAL_COMPILER_PGI := $(call kokkos_has_string,$(KOKKOS_CXX_VERSION),PGI)
 KOKKOS_INTERNAL_COMPILER_XL := $(strip $(shell $(CXX) -qversion 2>&1 | grep XL | wc -l))
 KOKKOS_INTERNAL_COMPILER_CRAY := $(strip $(shell $(CXX) -craype-verbose 2>&1 | grep "CC-" | wc -l))
-KOKKOS_INTERNAL_COMPILER_NVCC := $(strip $(shell export OMPI_CXX=$(OMPI_CXX); export MPICH_CXX=$(MPICH_CXX); $(CXX) --version 2>&1 | grep nvcc | wc -l))
+KOKKOS_INTERNAL_COMPILER_NVCC := $(strip $(shell export OMPI_CXX=$(OMPI_CXX); export MPICH_CXX=$(MPICH_CXX); $(CXX) --version 2>&1 | grep nvcc | wc -l))
 KOKKOS_INTERNAL_COMPILER_CLANG := $(call kokkos_has_string,$(KOKKOS_CXX_VERSION),clang)
 KOKKOS_INTERNAL_COMPILER_APPLE_CLANG := $(call kokkos_has_string,$(KOKKOS_CXX_VERSION),apple-darwin)
 KOKKOS_INTERNAL_COMPILER_HCC := $(call kokkos_has_string,$(KOKKOS_CXX_VERSION),HCC)
@@ -229,12 +230,16 @@ KOKKOS_INTERNAL_USE_ARCH_MAXWELL52 := $(call kokkos_has_string,$(KOKKOS_ARCH),Ma
 KOKKOS_INTERNAL_USE_ARCH_MAXWELL53 := $(call kokkos_has_string,$(KOKKOS_ARCH),Maxwell53)
 KOKKOS_INTERNAL_USE_ARCH_PASCAL61 := $(call kokkos_has_string,$(KOKKOS_ARCH),Pascal61)
 KOKKOS_INTERNAL_USE_ARCH_PASCAL60 := $(call kokkos_has_string,$(KOKKOS_ARCH),Pascal60)
+KOKKOS_INTERNAL_USE_ARCH_VOLTA70 := $(call kokkos_has_string,$(KOKKOS_ARCH),Volta70)
+KOKKOS_INTERNAL_USE_ARCH_VOLTA72 := $(call kokkos_has_string,$(KOKKOS_ARCH),Volta72)
 KOKKOS_INTERNAL_USE_ARCH_NVIDIA := $(shell expr $(KOKKOS_INTERNAL_USE_ARCH_KEPLER30) \
   + $(KOKKOS_INTERNAL_USE_ARCH_KEPLER32) \
   + $(KOKKOS_INTERNAL_USE_ARCH_KEPLER35) \
   + $(KOKKOS_INTERNAL_USE_ARCH_KEPLER37) \
   + $(KOKKOS_INTERNAL_USE_ARCH_PASCAL61) \
   + $(KOKKOS_INTERNAL_USE_ARCH_PASCAL60) \
+  + $(KOKKOS_INTERNAL_USE_ARCH_VOLTA70) \
+  + $(KOKKOS_INTERNAL_USE_ARCH_VOLTA72) \
   + $(KOKKOS_INTERNAL_USE_ARCH_MAXWELL50) \
   + $(KOKKOS_INTERNAL_USE_ARCH_MAXWELL52) \
   + $(KOKKOS_INTERNAL_USE_ARCH_MAXWELL53))
@@ -249,6 +254,8 @@ ifeq ($(KOKKOS_INTERNAL_USE_ARCH_NVIDIA), 0)
     + $(KOKKOS_INTERNAL_USE_ARCH_KEPLER37) \
     + $(KOKKOS_INTERNAL_USE_ARCH_PASCAL61) \
     + $(KOKKOS_INTERNAL_USE_ARCH_PASCAL60) \
+    + $(KOKKOS_INTERNAL_USE_ARCH_VOLTA70) \
+    + $(KOKKOS_INTERNAL_USE_ARCH_VOLTA72) \
     + $(KOKKOS_INTERNAL_USE_ARCH_MAXWELL50) \
     + $(KOKKOS_INTERNAL_USE_ARCH_MAXWELL52) \
     + $(KOKKOS_INTERNAL_USE_ARCH_MAXWELL53))
@@ -267,7 +274,8 @@ endif
 KOKKOS_INTERNAL_USE_ARCH_ARMV80 := $(call kokkos_has_string,$(KOKKOS_ARCH),ARMv80)
 KOKKOS_INTERNAL_USE_ARCH_ARMV81 := $(call kokkos_has_string,$(KOKKOS_ARCH),ARMv81)
 KOKKOS_INTERNAL_USE_ARCH_ARMV8_THUNDERX := $(call kokkos_has_string,$(KOKKOS_ARCH),ARMv8-ThunderX)
-KOKKOS_INTERNAL_USE_ARCH_ARM := $(strip $(shell echo $(KOKKOS_INTERNAL_USE_ARCH_ARMV80)+$(KOKKOS_INTERNAL_USE_ARCH_ARMV81)+$(KOKKOS_INTERNAL_USE_ARCH_ARMV8_THUNDERX) | bc))
+KOKKOS_INTERNAL_USE_ARCH_ARMV8_THUNDERX2 := $(call kokkos_has_string,$(KOKKOS_ARCH),ARMv8-TX2)
+KOKKOS_INTERNAL_USE_ARCH_ARM := $(strip $(shell echo $(KOKKOS_INTERNAL_USE_ARCH_ARMV80)+$(KOKKOS_INTERNAL_USE_ARCH_ARMV81)+$(KOKKOS_INTERNAL_USE_ARCH_ARMV8_THUNDERX)+$(KOKKOS_INTERNAL_USE_ARCH_ARMV8_THUNDERX2) | bc))
 
 # IBM based.
 KOKKOS_INTERNAL_USE_ARCH_BGQ := $(call kokkos_has_string,$(KOKKOS_ARCH),BGQ)
@@ -316,6 +324,9 @@ endif
 # Generating the list of Flags.
 
 KOKKOS_CPPFLAGS = -I./ -I$(KOKKOS_PATH)/core/src -I$(KOKKOS_PATH)/containers/src -I$(KOKKOS_PATH)/algorithms/src
+KOKKOS_TPL_INCLUDE_DIRS =
+KOKKOS_TPL_LIBRARY_DIRS =
+KOKKOS_TPL_LIBRARY_NAMES =
 
 KOKKOS_CXXFLAGS =
 ifeq ($(KOKKOS_INTERNAL_ENABLE_COMPILER_WARNINGS), 1)
@@ -323,7 +334,9 @@ ifeq ($(KOKKOS_INTERNAL_ENABLE_COMPILER_WARNINGS), 1)
 endif
 
 KOKKOS_LIBS = -ldl
+KOKKOS_TPL_LIBRARY_NAMES += dl
 KOKKOS_LDFLAGS = -L$(shell pwd)
+KOKKOS_LINK_FLAGS =
 KOKKOS_SRC =
 KOKKOS_HEADERS =
 
@@ -437,21 +450,32 @@ ifeq ($(KOKKOS_INTERNAL_ENABLE_PROFILING_LOAD_PRINT), 1)
 endif
 
 ifeq ($(KOKKOS_INTERNAL_USE_HWLOC), 1)
-  KOKKOS_CPPFLAGS += -I$(HWLOC_PATH)/include
-  KOKKOS_LDFLAGS += -L$(HWLOC_PATH)/lib
+  ifneq ($(HWLOC_PATH),)
+    KOKKOS_CPPFLAGS += -I$(HWLOC_PATH)/include
+    KOKKOS_LDFLAGS += -L$(HWLOC_PATH)/lib
+    KOKKOS_TPL_INCLUDE_DIRS += $(HWLOC_PATH)/include
+    KOKKOS_TPL_LIBRARY_DIRS += $(HWLOC_PATH)/lib
+  endif
   KOKKOS_LIBS += -lhwloc
+  KOKKOS_TPL_LIBRARY_NAMES += hwloc
   tmp := $(call kokkos_append_header,"\#define KOKKOS_HAVE_HWLOC")
 endif
 
 ifeq ($(KOKKOS_INTERNAL_USE_LIBRT), 1)
   tmp := $(call kokkos_append_header,"\#define KOKKOS_USE_LIBRT")
   KOKKOS_LIBS += -lrt
+  KOKKOS_TPL_LIBRARY_NAMES += rt
 endif
 
 ifeq ($(KOKKOS_INTERNAL_USE_MEMKIND), 1)
-  KOKKOS_CPPFLAGS += -I$(MEMKIND_PATH)/include
-  KOKKOS_LDFLAGS += -L$(MEMKIND_PATH)/lib
+  ifneq ($(MEMKIND_PATH),)
+    KOKKOS_CPPFLAGS += -I$(MEMKIND_PATH)/include
+    KOKKOS_LDFLAGS += -L$(MEMKIND_PATH)/lib
+    KOKKOS_TPL_INCLUDE_DIRS += $(MEMKIND_PATH)/include
+    KOKKOS_TPL_LIBRARY_DIRS += $(MEMKIND_PATH)/lib
+  endif
   KOKKOS_LIBS += -lmemkind -lnuma
+  KOKKOS_TPL_LIBRARY_NAMES += memkind numa
   tmp := $(call kokkos_append_header,"\#define KOKKOS_HAVE_HBWSPACE")
 endif
 
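The `ifneq ($(HWLOC_PATH),)` and `ifneq ($(MEMKIND_PATH),)` guards added in this hunk keep empty-path flags such as `-I/include` out of the build when no TPL path is set, while the library name is still recorded. A minimal, self-contained GNU make sketch of that pattern follows. The `DEMO_` variable names are hypothetical and exist only for illustration; they are not part of Makefile.kokkos.

```make
# Standalone sketch of the guarded TPL path pattern (DEMO_ names are hypothetical).
HWLOC_PATH ?=

DEMO_CPPFLAGS :=
DEMO_TPL_INCLUDE_DIRS :=
DEMO_TPL_LIBRARY_NAMES :=

# Only add search paths when a path was actually provided.
ifneq ($(HWLOC_PATH),)
  DEMO_CPPFLAGS += -I$(HWLOC_PATH)/include
  DEMO_TPL_INCLUDE_DIRS += $(HWLOC_PATH)/include
endif

# The library name is recorded either way, so a copy of hwloc in a
# default system location can still be picked up at link time.
DEMO_TPL_LIBRARY_NAMES += hwloc

$(info cppflags=[$(DEMO_CPPFLAGS)] names=[$(DEMO_TPL_LIBRARY_NAMES)])

# Empty default target so that `make -f` has something to build.
all: ;
```

Saved as, say, `sketch.mk`, this can be run once as `make -f sketch.mk` and once as `make -f sketch.mk HWLOC_PATH=/opt/hwloc` (an example path) to compare the two cases.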
@@ -459,6 +483,10 @@ ifeq ($(KOKKOS_INTERNAL_DISABLE_PROFILING), 0)
   tmp := $(call kokkos_append_header,"\#define KOKKOS_ENABLE_PROFILING")
 endif
 
+ifeq ($(KOKKOS_INTERNAL_DISABLE_DEPRECATED_CODE), 0)
+  tmp := $(call kokkos_append_header,"\#define KOKKOS_ENABLE_DEPRECATED_CODE")
+endif
+
 tmp := $(call kokkos_append_header,"/* Optimization Settings */")
 
 ifeq ($(KOKKOS_INTERNAL_OPT_RANGE_AGGRESSIVE_VECTORIZATION), 1)
@@ -560,6 +588,24 @@ ifeq ($(KOKKOS_INTERNAL_USE_ARCH_ARMV8_THUNDERX), 1)
   endif
 endif
 
+ifeq ($(KOKKOS_INTERNAL_USE_ARCH_ARMV8_THUNDERX2), 1)
+  tmp := $(call kokkos_append_header,"\#define KOKKOS_ARCH_ARMV81")
+  tmp := $(call kokkos_append_header,"\#define KOKKOS_ARCH_ARMV8_THUNDERX2")
+
+  ifeq ($(KOKKOS_INTERNAL_COMPILER_CRAY), 1)
+    KOKKOS_CXXFLAGS +=
+    KOKKOS_LDFLAGS +=
+  else
+    ifeq ($(KOKKOS_INTERNAL_COMPILER_PGI), 1)
+      KOKKOS_CXXFLAGS +=
+      KOKKOS_LDFLAGS +=
+    else
+      KOKKOS_CXXFLAGS += -mtune=thunderx2t99 -mcpu=thunderx2t99
+      KOKKOS_LDFLAGS += -mtune=thunderx2t99 -mcpu=thunderx2t99
+    endif
+  endif
+endif
+
 ifeq ($(KOKKOS_INTERNAL_USE_ARCH_SSE42), 1)
   tmp := $(call kokkos_append_header,"\#define KOKKOS_ARCH_SSE42")
 
@@ -754,10 +800,11 @@ endif
 ifeq ($(KOKKOS_INTERNAL_USE_CUDA), 1)
   ifeq ($(KOKKOS_INTERNAL_COMPILER_NVCC), 1)
     KOKKOS_INTERNAL_CUDA_ARCH_FLAG=-arch
-  endif
-  ifeq ($(KOKKOS_INTERNAL_COMPILER_CLANG), 1)
-    KOKKOS_INTERNAL_CUDA_ARCH_FLAG=--cuda-gpu-arch
-    KOKKOS_CXXFLAGS += -x cuda
+  else ifeq ($(KOKKOS_INTERNAL_COMPILER_CLANG), 1)
+    KOKKOS_INTERNAL_CUDA_ARCH_FLAG=--cuda-gpu-arch
+    KOKKOS_CXXFLAGS += -x cuda
+  else
+    $(error Makefile.kokkos: CUDA is enabled but the compiler is neither NVCC nor Clang)
   endif
 
   ifeq ($(KOKKOS_INTERNAL_USE_ARCH_KEPLER30), 1)
@@ -805,6 +852,16 @@ ifeq ($(KOKKOS_INTERNAL_USE_CUDA), 1)
     tmp := $(call kokkos_append_header,"\#define KOKKOS_ARCH_PASCAL61")
     KOKKOS_INTERNAL_CUDA_ARCH_FLAG := $(KOKKOS_INTERNAL_CUDA_ARCH_FLAG)=sm_61
   endif
+  ifeq ($(KOKKOS_INTERNAL_USE_ARCH_VOLTA70), 1)
+    tmp := $(call kokkos_append_header,"\#define KOKKOS_ARCH_VOLTA")
+    tmp := $(call kokkos_append_header,"\#define KOKKOS_ARCH_VOLTA70")
+    KOKKOS_INTERNAL_CUDA_ARCH_FLAG := $(KOKKOS_INTERNAL_CUDA_ARCH_FLAG)=sm_70
+  endif
+  ifeq ($(KOKKOS_INTERNAL_USE_ARCH_VOLTA72), 1)
+    tmp := $(call kokkos_append_header,"\#define KOKKOS_ARCH_VOLTA")
+    tmp := $(call kokkos_append_header,"\#define KOKKOS_ARCH_VOLTA72")
+    KOKKOS_INTERNAL_CUDA_ARCH_FLAG := $(KOKKOS_INTERNAL_CUDA_ARCH_FLAG)=sm_72
+  endif
 
   ifneq ($(KOKKOS_INTERNAL_USE_ARCH_NVIDIA), 0)
     KOKKOS_CXXFLAGS += $(KOKKOS_INTERNAL_CUDA_ARCH_FLAG)
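The Volta hunks follow the same three-step pattern as the other NVIDIA architectures: detect the name in `KOKKOS_ARCH`, add the result into the NVIDIA sum, then append `=sm_NN` to the compiler-specific arch flag. Below is a minimal standalone GNU make sketch of that flow. It assumes `kokkos_has_string` is a plain substring test (its definition is not shown in this diff), and the `DEMO_` variable names are hypothetical.

```make
# Assumed stand-in for the real helper, which is defined elsewhere in Makefile.kokkos.
kokkos_has_string = $(if $(findstring $(2),$(1)),1,0)

KOKKOS_ARCH ?= Volta70

# Step 1: detect each architecture name in KOKKOS_ARCH (1 if present, else 0).
DEMO_USE_VOLTA70 := $(call kokkos_has_string,$(KOKKOS_ARCH),Volta70)
DEMO_USE_VOLTA72 := $(call kokkos_has_string,$(KOKKOS_ARCH),Volta72)

# Step 2: sum the results; non-zero means an NVIDIA architecture was requested.
DEMO_USE_NVIDIA := $(shell expr $(DEMO_USE_VOLTA70) + $(DEMO_USE_VOLTA72))

# NVCC spelling of the flag; the diff uses --cuda-gpu-arch for Clang.
DEMO_ARCH_FLAG := -arch

# Step 3: append the compute capability for the selected architecture.
ifeq ($(DEMO_USE_VOLTA70), 1)
  DEMO_ARCH_FLAG := $(DEMO_ARCH_FLAG)=sm_70
endif
ifeq ($(DEMO_USE_VOLTA72), 1)
  DEMO_ARCH_FLAG := $(DEMO_ARCH_FLAG)=sm_72
endif

$(info nvidia=$(DEMO_USE_NVIDIA) flag=$(DEMO_ARCH_FLAG))

# Empty default target so that `make -f` has something to build.
all: ;
```

With the default `KOKKOS_ARCH` the sketch reports `nvidia=1 flag=-arch=sm_70`. Because detection is by substring, requesting both Volta names at once would append two suffixes to the same flag, so a single NVIDIA architecture per build is the sensible input.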
@@ -850,6 +907,7 @@ ifeq ($(KOKKOS_INTERNAL_USE_ROCM), 1)
 
   KOKKOS_CXXFLAGS += $(shell $(ROCM_HCC_PATH)/bin/hcc-config --cxxflags)
   KOKKOS_LDFLAGS += $(shell $(ROCM_HCC_PATH)/bin/hcc-config --ldflags) -lhc_am -lm
+  KOKKOS_TPL_LIBRARY_NAMES += hc_am m
   KOKKOS_LDFLAGS += $(KOKKOS_INTERNAL_ROCM_ARCH_FLAG)
 
   KOKKOS_SRC += $(wildcard $(KOKKOS_PATH)/core/src/ROCm/*.cpp)
@@ -880,13 +938,17 @@ KOKKOS_SRC += $(wildcard $(KOKKOS_PATH)/containers/src/impl/*.cpp)
 ifeq ($(KOKKOS_INTERNAL_USE_CUDA), 1)
   KOKKOS_SRC += $(wildcard $(KOKKOS_PATH)/core/src/Cuda/*.cpp)
   KOKKOS_HEADERS += $(wildcard $(KOKKOS_PATH)/core/src/Cuda/*.hpp)
-  KOKKOS_CPPFLAGS += -I$(CUDA_PATH)/include
-  KOKKOS_LDFLAGS += -L$(CUDA_PATH)/lib64
-  KOKKOS_LIBS += -lcudart -lcuda
-
-  ifeq ($(KOKKOS_INTERNAL_COMPILER_CLANG), 1)
-    KOKKOS_CXXFLAGS += --cuda-path=$(CUDA_PATH)
+  ifneq ($(CUDA_PATH),)
+    KOKKOS_CPPFLAGS += -I$(CUDA_PATH)/include
+    KOKKOS_LDFLAGS += -L$(CUDA_PATH)/lib64
+    KOKKOS_TPL_INCLUDE_DIRS += $(CUDA_PATH)/include
+    KOKKOS_TPL_LIBRARY_DIRS += $(CUDA_PATH)/lib64
+    ifeq ($(KOKKOS_INTERNAL_COMPILER_CLANG), 1)
+      KOKKOS_CXXFLAGS += --cuda-path=$(CUDA_PATH)
     endif
+  endif
+  KOKKOS_LIBS += -lcudart -lcuda
+  KOKKOS_TPL_LIBRARY_NAMES += cudart cuda
 endif
 
 ifeq ($(KOKKOS_INTERNAL_USE_OPENMPTARGET), 1)
@@ -911,20 +973,27 @@ ifeq ($(KOKKOS_INTERNAL_USE_OPENMP), 1)
   endif
 
   KOKKOS_LDFLAGS += $(KOKKOS_INTERNAL_OPENMP_FLAG)
+  KOKKOS_LINK_FLAGS += $(KOKKOS_INTERNAL_OPENMP_FLAG)
 endif
 
 ifeq ($(KOKKOS_INTERNAL_USE_PTHREADS), 1)
   KOKKOS_SRC += $(wildcard $(KOKKOS_PATH)/core/src/Threads/*.cpp)
   KOKKOS_HEADERS += $(wildcard $(KOKKOS_PATH)/core/src/Threads/*.hpp)
   KOKKOS_LIBS += -lpthread
+  KOKKOS_TPL_LIBRARY_NAMES += pthread
 endif
 
 ifeq ($(KOKKOS_INTERNAL_USE_QTHREADS), 1)
   KOKKOS_SRC += $(wildcard $(KOKKOS_PATH)/core/src/Qthreads/*.cpp)
   KOKKOS_HEADERS += $(wildcard $(KOKKOS_PATH)/core/src/Qthreads/*.hpp)
-  KOKKOS_CPPFLAGS += -I$(QTHREADS_PATH)/include
-  KOKKOS_LDFLAGS += -L$(QTHREADS_PATH)/lib
+  ifneq ($(QTHREADS_PATH),)
+    KOKKOS_CPPFLAGS += -I$(QTHREADS_PATH)/include
+    KOKKOS_LDFLAGS += -L$(QTHREADS_PATH)/lib
+    KOKKOS_TPL_INCLUDE_DIRS += $(QTHREADS_PATH)/include
+    KOKKOS_TPL_LIBRARY_DIRS += $(QTHREADS_PATH)/lib64
+  endif
   KOKKOS_LIBS += -lqthread
+  KOKKOS_TPL_LIBRARY_NAMES += qthread
 endif
 
 # Explicitly set the GCC Toolchain for Clang.
@@ -940,11 +1009,6 @@ ifneq ($(KOKKOS_INTERNAL_USE_MEMKIND), 1)
   KOKKOS_SRC := $(filter-out $(KOKKOS_PATH)/core/src/impl/Kokkos_HBWSpace.cpp,$(KOKKOS_SRC))
 endif
 
-# Don't include Kokkos_Profiling_Interface.cpp if not using profiling to avoid a link warning.
-ifeq ($(KOKKOS_INTERNAL_DISABLE_PROFILING), 1)
-  KOKKOS_SRC := $(filter-out $(KOKKOS_PATH)/core/src/impl/Kokkos_Profiling_Interface.cpp,$(KOKKOS_SRC))
-endif
-
 # Don't include Kokkos_Serial.cpp or Kokkos_Serial_Task.cpp if not using Serial
 # device to avoid a link warning.
 ifneq ($(KOKKOS_INTERNAL_USE_SERIAL), 1)

@@ -1,87 +1,101 @@
-Kokkos implements a programming model in C++ for writing performance portable
+Kokkos Core implements a programming model in C++ for writing performance portable
 applications targeting all major HPC platforms. For that purpose it provides
 abstractions for both parallel execution of code and data management.
 Kokkos is designed to target complex node architectures with N-level memory
 hierarchies and multiple types of execution resources. It currently can use
 OpenMP, Pthreads and CUDA as backend programming models.
 
-Kokkos is licensed under standard 3-clause BSD terms of use. For specifics
-see the LICENSE file contained in the repository or distribution.
+Kokkos Core is part of the Kokkos C++ Performance Portability Programming EcoSystem,
+which also provides math kernels (https://github.com/kokkos/kokkos-kernels), as well as
+profiling and debugging tools (https://github.com/kokkos/kokkos-tools).
 
-The core developers of Kokkos are Carter Edwards and Christian Trott
-at the Computer Science Research Institute of the Sandia National
-Laboratories.
+# Learning about Kokkos
 
-The KokkosP interface and associated tools are developed by the Application
-Performance Team and Kokkos core developers at Sandia National Laboratories.
+A programming guide can be found on the Wiki, the API reference is under development.
 
-To learn more about Kokkos consider watching one of our presentations:
-GTC 2015:
-http://on-demand.gputechconf.com/gtc/2015/video/S5166.html
-http://on-demand.gputechconf.com/gtc/2015/presentation/S5166-H-Carter-Edwards.pdf
+For questions find us on Slack: https://kokkosteam.slack.com or open a github issue.
 
-A programming guide can be found under doc/Kokkos_PG.pdf. This is an initial version
-and feedback is greatly appreciated.
+For non-public questions send an email to
+crtrott(at)sandia.gov
 
 A separate repository with extensive tutorial material can be found under
 https://github.com/kokkos/kokkos-tutorials.
 
-If you have a patch to contribute please feel free to issue a pull request against
-the develop branch. For major contributions it is better to contact us first
-for guidance.
+Furthermore, the 'example/tutorial' directory provides step by step tutorial
+examples which explain many of the features of Kokkos. They work with
+simple Makefiles. To build with g++ and OpenMP simply type 'make'
+in the 'example/tutorial' directory. This will build all examples in the
+subfolders. To change the build options refer to the Programming Guide
+in the compilation section.
 
-For questions please send an email to
-kokkos-users@software.sandia.gov
+To learn more about Kokkos consider watching one of our presentations:
+* GTC 2015:
+  - http://on-demand.gputechconf.com/gtc/2015/video/S5166.html
+  - http://on-demand.gputechconf.com/gtc/2015/presentation/S5166-H-Carter-Edwards.pdf
 
-For non-public questions send an email to
-hcedwar(at)sandia.gov and crtrott(at)sandia.gov
 
-============================================================================
-====Requirements============================================================
-============================================================================
+# Contributing to Kokkos
 
-Primary tested compilers on X86 are:
-GCC 4.8.4
-GCC 4.9.3
-GCC 5.1.0
-GCC 5.3.0
-GCC 6.1.0
-Intel 15.0.2
-Intel 16.0.1
-Intel 17.1.043
-Intel 17.4.196
-Intel 18.0.128
-Clang 3.5.2
-Clang 3.6.1
-Clang 3.7.1
-Clang 3.8.1
-Clang 3.9.0
-Clang 4.0.0
-Clang 4.0.0 for CUDA (CUDA Toolkit 8.0.44)
-PGI 17.10
-NVCC 7.0 for CUDA (with gcc 4.8.4)
-NVCC 7.5 for CUDA (with gcc 4.8.4)
-NVCC 8.0.44 for CUDA (with gcc 5.3.0)
+We are open and try to encourage contributions from external developers.
+To do so please first open an issue describing the contribution and then issue
+a pull request against the develop branch. For larger features it may be good
+to get guidance from the core development team first through the github issue.
 
-Primary tested compilers on Power 8 are:
-GCC 5.4.0 (OpenMP,Serial)
-IBM XL 13.1.5 (OpenMP, Serial) (There is a workaround in place to avoid a compiler bug)
-NVCC 8.0.44 for CUDA (with gcc 5.4.0)
-NVCC 9.0.103 for CUDA (with gcc 6.3.0)
+Note that Kokkos Core is licensed under standard 3-clause BSD terms of use.
+Which means contributing to Kokkos allows anyone else to use your contributions
+not just for public purposes but also for closed source commercial projects.
+For specifics see the LICENSE file contained in the repository or distribution.
 
-Primary tested compilers on Intel KNL are:
-GCC 6.2.0
-Intel 16.4.258 (with gcc 4.7.2)
-Intel 17.2.174 (with gcc 4.9.3)
-Intel 18.0.128 (with gcc 4.9.3)
+# Requirements
 
-Other compilers working:
-X86:
-Cygwin 2.1.0 64bit with gcc 4.9.3
+### Primary tested compilers on X86 are:
+* GCC 4.8.4
+* GCC 4.9.3
+* GCC 5.1.0
+* GCC 5.3.0
+* GCC 6.1.0
+* Intel 15.0.2
+* Intel 16.0.1
+* Intel 17.1.043
+* Intel 17.4.196
+* Intel 18.0.128
+* Clang 3.6.1
+* Clang 3.7.1
+* Clang 3.8.1
+* Clang 3.9.0
+* Clang 4.0.0
+* Clang 4.0.0 for CUDA (CUDA Toolkit 8.0.44)
+* Clang 6.0.0 for CUDA (CUDA Toolkit 9.1)
+* PGI 17.10
+* NVCC 7.0 for CUDA (with gcc 4.8.4)
+* NVCC 7.5 for CUDA (with gcc 4.8.4)
+* NVCC 8.0.44 for CUDA (with gcc 5.3.0)
+* NVCC 9.1 for CUDA (with gcc 6.1.0)
 
-Known non-working combinations:
-Power8:
-Pthreads backend
+### Primary tested compilers on Power 8 are:
+* GCC 5.4.0 (OpenMP,Serial)
+* IBM XL 13.1.6 (OpenMP, Serial)
+* NVCC 8.0.44 for CUDA (with gcc 5.4.0)
+* NVCC 9.0.103 for CUDA (with gcc 6.3.0 and XL 13.1.6)
+
+### Primary tested compilers on Intel KNL are:
+* GCC 6.2.0
+* Intel 16.4.258 (with gcc 4.7.2)
+* Intel 17.2.174 (with gcc 4.9.3)
+* Intel 18.0.128 (with gcc 4.9.3)
+
+### Primary tested compilers on ARM
+* GCC 6.1.0
+
+### Other compilers working:
+* X86:
+  - Cygwin 2.1.0 64bit with gcc 4.9.3
+
+### Known non-working combinations:
+* Power8:
+  - Pthreads backend
+* ARM
+  - Pthreads backend
 
 
 Primary tested compiler are passing in release mode
@@ -97,20 +111,7 @@ NVCC: -Wall -Wshadow -pedantic -Werror -Wsign-compare -Wtype-limits -Wuninitiali
 Other compilers are tested occasionally, in particular when pushing from develop to
 master branch, without -Werror and only for a select set of backends.
 
-============================================================================
-====Getting started=========================================================
-============================================================================
-
-In the 'example/tutorial' directory you will find step by step tutorial
-examples which explain many of the features of Kokkos. They work with
-simple Makefiles. To build with g++ and OpenMP simply type 'make'
-in the 'example/tutorial' directory. This will build all examples in the
-subfolders. To change the build options refer to the Programming Guide
-in the compilation section.
-
-============================================================================
-====Running Unit Tests======================================================
-============================================================================
+# Running Unit Tests
 
 To run the unit tests create a build directory and run the following commands
 
@ -121,30 +122,35 @@ make test
|
|||
Run KOKKOS_PATH/generate_makefile.bash --help for more detailed options such as
|
||||
changing the device type for which to build.
|
||||
|
||||
============================================================================
|
||||
====Install the library=====================================================
|
||||
============================================================================
|
||||
# Installing the library
|
||||
|
||||
To install Kokkos as a library create a build directory and run the following
|
||||
|
||||
KOKKOS_PATH/generate_makefile.bash --prefix=INSTALL_PATH
|
||||
make lib
|
||||
make kokkoslib
|
||||
make install
|
||||
|
||||
KOKKOS_PATH/generate_makefile.bash --help for more detailed options such as
|
||||
changing the device type for which to build.
|
||||
|
||||
============================================================================
|
||||
====CMakeFiles==============================================================
|
||||
============================================================================
|
||||
Note that in many cases it is preferable to build Kokkos inline with an
|
||||
application. The main reason is that you may otherwise need many different
|
||||
configurations of Kokkos installed depending on the required compile time
|
||||
features an application needs. For example there is only one default
|
||||
execution space, which means you need different installations to have OpenMP
|
||||
or Pthreads as the default space. Also for the CUDA backend there are certain
|
||||
choices, such as allowing relocatable device code, which must be made at
|
||||
installation time. Building Kokkos inline uses largely the same process
|
||||
as compiling an application against an installed Kokkos library. See for
|
||||
example benchmarks/bytes_and_flops/Makefile which can be used with an installed
|
||||
library and for an inline build.
|
||||
|
||||
The CMake files contained in this repository require Tribits and are used
|
||||
for integration with Trilinos. They do not currently support a standalone
|
||||
CMake build.
|
||||
### CMake
|
||||
|
||||
===========================================================================
|
||||
====Kokkos and CUDA UVM====================================================
|
||||
===========================================================================
|
||||
Kokkos supports being built as part of a CMake application. An example can
|
||||
be found in example/cmake_build.
|
||||
|
||||
# Kokkos and CUDA UVM
|
||||
|
||||
Kokkos supports UVM through a dedicated memory space called CudaUVMSpace.
|
||||
Allocations made with that space are accessible from host and device.
|
||||
|
@ -154,25 +160,16 @@ In either case UVM comes with a number of restrictions:
|
|||
running. This will lead to segfaults. To avoid that you either need to
|
||||
call Kokkos::Cuda::fence() (or just Kokkos::fence()), after kernels, or
|
||||
you can set the environment variable CUDA_LAUNCH_BLOCKING=1.
|
||||
Furthermore in multi socket multi GPU machines, UVM defaults to using
|
||||
zero copy allocations for technical reasons related to using multiple
|
||||
Furthermore in multi socket multi GPU machines without NVLINK, UVM defaults
|
||||
to using zero copy allocations for technical reasons related to using multiple
|
||||
GPUs from the same process. If an executable doesn't do that (e.g. each
|
||||
MPI rank of an application uses a single GPU [can be the same GPU for
|
||||
multiple MPI ranks]) you can set CUDA_MANAGED_FORCE_DEVICE_ALLOC=1.
|
||||
This will enforce proper UVM allocations, but can lead to errors if
|
||||
more than a single GPU is used by a single process.
|
||||
|
||||
===========================================================================
|
||||
====Contributing===========================================================
|
||||
===========================================================================
|
||||
|
||||
Contributions to Kokkos are welcome. In order to do so, please open an issue
|
||||
where a feature request or bug can be discussed. Then issue a pull request
|
||||
with your contribution. Pull requests must be issued against the develop branch.
|
||||
|
||||
===========================================================================
|
||||
====Citing Kokkos==========================================================
|
||||
===========================================================================
|
||||
# Citing Kokkos
|
||||
|
||||
If you publish work which mentions Kokkos, please cite the following paper:
|
||||
|
||||
|
|
|
@ -35,7 +35,7 @@
|
|||
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
//
|
||||
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
|
||||
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
|
||||
//
|
||||
// ************************************************************************
|
||||
//@HEADER
|
||||
|
@ -1530,7 +1530,7 @@ struct fill_random_functor_range<ViewType,RandomPool,loops,1,IndexType>{
|
|||
typename RandomPool::generator_type gen = rand_pool.get_state();
|
||||
for(IndexType j=0;j<loops;j++) {
|
||||
const IndexType idx = i*loops+j;
|
||||
if(idx<static_cast<IndexType>(a.dimension_0()))
|
||||
if(idx<static_cast<IndexType>(a.extent(0)))
|
||||
a(idx) = Rand::draw(gen,range);
|
||||
}
|
||||
rand_pool.free_state(gen);
|
||||
|
@ -1555,8 +1555,8 @@ struct fill_random_functor_range<ViewType,RandomPool,loops,2,IndexType>{
|
|||
typename RandomPool::generator_type gen = rand_pool.get_state();
|
||||
for(IndexType j=0;j<loops;j++) {
|
||||
const IndexType idx = i*loops+j;
|
||||
if(idx<static_cast<IndexType>(a.dimension_0())) {
|
||||
for(IndexType k=0;k<static_cast<IndexType>(a.dimension_1());k++)
|
||||
if(idx<static_cast<IndexType>(a.extent(0))) {
|
||||
for(IndexType k=0;k<static_cast<IndexType>(a.extent(1));k++)
|
||||
a(idx,k) = Rand::draw(gen,range);
|
||||
}
|
||||
}
|
||||
|
@ -1583,9 +1583,9 @@ struct fill_random_functor_range<ViewType,RandomPool,loops,3,IndexType>{
|
|||
typename RandomPool::generator_type gen = rand_pool.get_state();
|
||||
for(IndexType j=0;j<loops;j++) {
|
||||
const IndexType idx = i*loops+j;
|
||||
if(idx<static_cast<IndexType>(a.dimension_0())) {
|
||||
for(IndexType k=0;k<static_cast<IndexType>(a.dimension_1());k++)
|
||||
for(IndexType l=0;l<static_cast<IndexType>(a.dimension_2());l++)
|
||||
if(idx<static_cast<IndexType>(a.extent(0))) {
|
||||
for(IndexType k=0;k<static_cast<IndexType>(a.extent(1));k++)
|
||||
for(IndexType l=0;l<static_cast<IndexType>(a.extent(2));l++)
|
||||
a(idx,k,l) = Rand::draw(gen,range);
|
||||
}
|
||||
}
|
||||
|
@ -1611,10 +1611,10 @@ struct fill_random_functor_range<ViewType,RandomPool,loops,4, IndexType>{
|
|||
typename RandomPool::generator_type gen = rand_pool.get_state();
|
||||
for(IndexType j=0;j<loops;j++) {
|
||||
const IndexType idx = i*loops+j;
|
||||
if(idx<static_cast<IndexType>(a.dimension_0())) {
|
||||
for(IndexType k=0;k<static_cast<IndexType>(a.dimension_1());k++)
|
||||
for(IndexType l=0;l<static_cast<IndexType>(a.dimension_2());l++)
|
||||
for(IndexType m=0;m<static_cast<IndexType>(a.dimension_3());m++)
|
||||
if(idx<static_cast<IndexType>(a.extent(0))) {
|
||||
for(IndexType k=0;k<static_cast<IndexType>(a.extent(1));k++)
|
||||
for(IndexType l=0;l<static_cast<IndexType>(a.extent(2));l++)
|
||||
for(IndexType m=0;m<static_cast<IndexType>(a.extent(3));m++)
|
||||
a(idx,k,l,m) = Rand::draw(gen,range);
|
||||
}
|
||||
}
|
||||
|
@ -1640,11 +1640,11 @@ struct fill_random_functor_range<ViewType,RandomPool,loops,5,IndexType>{
|
|||
typename RandomPool::generator_type gen = rand_pool.get_state();
|
||||
for(IndexType j=0;j<loops;j++) {
|
||||
const IndexType idx = i*loops+j;
|
||||
if(idx<static_cast<IndexType>(a.dimension_0())) {
|
||||
for(IndexType k=0;k<static_cast<IndexType>(a.dimension_1());k++)
|
||||
for(IndexType l=0;l<static_cast<IndexType>(a.dimension_2());l++)
|
||||
for(IndexType m=0;m<static_cast<IndexType>(a.dimension_3());m++)
|
||||
for(IndexType n=0;n<static_cast<IndexType>(a.dimension_4());n++)
|
||||
if(idx<static_cast<IndexType>(a.extent(0))) {
|
||||
for(IndexType k=0;k<static_cast<IndexType>(a.extent(1));k++)
|
||||
for(IndexType l=0;l<static_cast<IndexType>(a.extent(2));l++)
|
||||
for(IndexType m=0;m<static_cast<IndexType>(a.extent(3));m++)
|
||||
for(IndexType n=0;n<static_cast<IndexType>(a.extent(4));n++)
|
||||
a(idx,k,l,m,n) = Rand::draw(gen,range);
|
||||
}
|
||||
}
|
||||
|
@ -1670,12 +1670,12 @@ struct fill_random_functor_range<ViewType,RandomPool,loops,6,IndexType>{
|
|||
typename RandomPool::generator_type gen = rand_pool.get_state();
|
||||
for(IndexType j=0;j<loops;j++) {
|
||||
const IndexType idx = i*loops+j;
|
||||
if(idx<static_cast<IndexType>(a.dimension_0())) {
|
||||
for(IndexType k=0;k<static_cast<IndexType>(a.dimension_1());k++)
|
||||
for(IndexType l=0;l<static_cast<IndexType>(a.dimension_2());l++)
|
||||
for(IndexType m=0;m<static_cast<IndexType>(a.dimension_3());m++)
|
||||
for(IndexType n=0;n<static_cast<IndexType>(a.dimension_4());n++)
|
||||
for(IndexType o=0;o<static_cast<IndexType>(a.dimension_5());o++)
|
||||
if(idx<static_cast<IndexType>(a.extent(0))) {
|
||||
for(IndexType k=0;k<static_cast<IndexType>(a.extent(1));k++)
|
||||
for(IndexType l=0;l<static_cast<IndexType>(a.extent(2));l++)
|
||||
for(IndexType m=0;m<static_cast<IndexType>(a.extent(3));m++)
|
||||
for(IndexType n=0;n<static_cast<IndexType>(a.extent(4));n++)
|
||||
for(IndexType o=0;o<static_cast<IndexType>(a.extent(5));o++)
|
||||
a(idx,k,l,m,n,o) = Rand::draw(gen,range);
|
||||
}
|
||||
}
|
||||
|
@ -1701,13 +1701,13 @@ struct fill_random_functor_range<ViewType,RandomPool,loops,7,IndexType>{
|
|||
typename RandomPool::generator_type gen = rand_pool.get_state();
|
||||
for(IndexType j=0;j<loops;j++) {
|
||||
const IndexType idx = i*loops+j;
|
||||
if(idx<static_cast<IndexType>(a.dimension_0())) {
|
||||
for(IndexType k=0;k<static_cast<IndexType>(a.dimension_1());k++)
|
||||
for(IndexType l=0;l<static_cast<IndexType>(a.dimension_2());l++)
|
||||
for(IndexType m=0;m<static_cast<IndexType>(a.dimension_3());m++)
|
||||
for(IndexType n=0;n<static_cast<IndexType>(a.dimension_4());n++)
|
||||
for(IndexType o=0;o<static_cast<IndexType>(a.dimension_5());o++)
|
||||
for(IndexType p=0;p<static_cast<IndexType>(a.dimension_6());p++)
|
||||
if(idx<static_cast<IndexType>(a.extent(0))) {
|
||||
for(IndexType k=0;k<static_cast<IndexType>(a.extent(1));k++)
|
||||
for(IndexType l=0;l<static_cast<IndexType>(a.extent(2));l++)
|
||||
for(IndexType m=0;m<static_cast<IndexType>(a.extent(3));m++)
|
||||
for(IndexType n=0;n<static_cast<IndexType>(a.extent(4));n++)
|
||||
for(IndexType o=0;o<static_cast<IndexType>(a.extent(5));o++)
|
||||
for(IndexType p=0;p<static_cast<IndexType>(a.extent(6));p++)
|
||||
a(idx,k,l,m,n,o,p) = Rand::draw(gen,range);
|
||||
}
|
||||
}
|
||||
|
@ -1733,14 +1733,14 @@ struct fill_random_functor_range<ViewType,RandomPool,loops,8,IndexType>{
|
|||
typename RandomPool::generator_type gen = rand_pool.get_state();
|
||||
for(IndexType j=0;j<loops;j++) {
|
||||
const IndexType idx = i*loops+j;
|
||||
if(idx<static_cast<IndexType>(a.dimension_0())) {
|
||||
for(IndexType k=0;k<static_cast<IndexType>(a.dimension_1());k++)
|
||||
for(IndexType l=0;l<static_cast<IndexType>(a.dimension_2());l++)
|
||||
for(IndexType m=0;m<static_cast<IndexType>(a.dimension_3());m++)
|
||||
for(IndexType n=0;n<static_cast<IndexType>(a.dimension_4());n++)
|
||||
for(IndexType o=0;o<static_cast<IndexType>(a.dimension_5());o++)
|
||||
for(IndexType p=0;p<static_cast<IndexType>(a.dimension_6());p++)
|
||||
for(IndexType q=0;q<static_cast<IndexType>(a.dimension_7());q++)
|
||||
if(idx<static_cast<IndexType>(a.extent(0))) {
|
||||
for(IndexType k=0;k<static_cast<IndexType>(a.extent(1));k++)
|
||||
for(IndexType l=0;l<static_cast<IndexType>(a.extent(2));l++)
|
||||
for(IndexType m=0;m<static_cast<IndexType>(a.extent(3));m++)
|
||||
for(IndexType n=0;n<static_cast<IndexType>(a.extent(4));n++)
|
||||
for(IndexType o=0;o<static_cast<IndexType>(a.extent(5));o++)
|
||||
for(IndexType p=0;p<static_cast<IndexType>(a.extent(6));p++)
|
||||
for(IndexType q=0;q<static_cast<IndexType>(a.extent(7));q++)
|
||||
a(idx,k,l,m,n,o,p,q) = Rand::draw(gen,range);
|
||||
}
|
||||
}
|
||||
|
@ -1765,7 +1765,7 @@ struct fill_random_functor_begin_end<ViewType,RandomPool,loops,1,IndexType>{
|
|||
typename RandomPool::generator_type gen = rand_pool.get_state();
|
||||
for(IndexType j=0;j<loops;j++) {
|
||||
const IndexType idx = i*loops+j;
|
||||
if(idx<static_cast<IndexType>(a.dimension_0()))
|
||||
if(idx<static_cast<IndexType>(a.extent(0)))
|
||||
a(idx) = Rand::draw(gen,begin,end);
|
||||
}
|
||||
rand_pool.free_state(gen);
|
||||
|
@ -1790,8 +1790,8 @@ struct fill_random_functor_begin_end<ViewType,RandomPool,loops,2,IndexType>{
|
|||
typename RandomPool::generator_type gen = rand_pool.get_state();
|
||||
for(IndexType j=0;j<loops;j++) {
|
||||
const IndexType idx = i*loops+j;
|
||||
if(idx<static_cast<IndexType>(a.dimension_0())) {
|
||||
for(IndexType k=0;k<static_cast<IndexType>(a.dimension_1());k++)
|
||||
if(idx<static_cast<IndexType>(a.extent(0))) {
|
||||
for(IndexType k=0;k<static_cast<IndexType>(a.extent(1));k++)
|
||||
a(idx,k) = Rand::draw(gen,begin,end);
|
||||
}
|
||||
}
|
||||
|
@ -1818,9 +1818,9 @@ struct fill_random_functor_begin_end<ViewType,RandomPool,loops,3,IndexType>{
|
|||
typename RandomPool::generator_type gen = rand_pool.get_state();
|
||||
for(IndexType j=0;j<loops;j++) {
|
||||
const IndexType idx = i*loops+j;
|
||||
if(idx<static_cast<IndexType>(a.dimension_0())) {
|
||||
for(IndexType k=0;k<static_cast<IndexType>(a.dimension_1());k++)
|
||||
for(IndexType l=0;l<static_cast<IndexType>(a.dimension_2());l++)
|
||||
if(idx<static_cast<IndexType>(a.extent(0))) {
|
||||
for(IndexType k=0;k<static_cast<IndexType>(a.extent(1));k++)
|
||||
for(IndexType l=0;l<static_cast<IndexType>(a.extent(2));l++)
|
||||
a(idx,k,l) = Rand::draw(gen,begin,end);
|
||||
}
|
||||
}
|
||||
|
@ -1846,10 +1846,10 @@ struct fill_random_functor_begin_end<ViewType,RandomPool,loops,4,IndexType>{
|
|||
typename RandomPool::generator_type gen = rand_pool.get_state();
|
||||
for(IndexType j=0;j<loops;j++) {
|
||||
const IndexType idx = i*loops+j;
|
||||
if(idx<static_cast<IndexType>(a.dimension_0())) {
|
||||
for(IndexType k=0;k<static_cast<IndexType>(a.dimension_1());k++)
|
||||
for(IndexType l=0;l<static_cast<IndexType>(a.dimension_2());l++)
|
||||
for(IndexType m=0;m<static_cast<IndexType>(a.dimension_3());m++)
|
||||
if(idx<static_cast<IndexType>(a.extent(0))) {
|
||||
for(IndexType k=0;k<static_cast<IndexType>(a.extent(1));k++)
|
||||
for(IndexType l=0;l<static_cast<IndexType>(a.extent(2));l++)
|
||||
for(IndexType m=0;m<static_cast<IndexType>(a.extent(3));m++)
|
||||
a(idx,k,l,m) = Rand::draw(gen,begin,end);
|
||||
}
|
||||
}
|
||||
|
@ -1875,11 +1875,11 @@ struct fill_random_functor_begin_end<ViewType,RandomPool,loops,5,IndexType>{
|
|||
typename RandomPool::generator_type gen = rand_pool.get_state();
|
||||
for(IndexType j=0;j<loops;j++) {
|
||||
const IndexType idx = i*loops+j;
|
||||
if(idx<static_cast<IndexType>(a.dimension_0())){
|
||||
for(IndexType l=0;l<static_cast<IndexType>(a.dimension_1());l++)
|
||||
for(IndexType m=0;m<static_cast<IndexType>(a.dimension_2());m++)
|
||||
for(IndexType n=0;n<static_cast<IndexType>(a.dimension_3());n++)
|
||||
for(IndexType o=0;o<static_cast<IndexType>(a.dimension_4());o++)
|
||||
if(idx<static_cast<IndexType>(a.extent(0))){
|
||||
for(IndexType l=0;l<static_cast<IndexType>(a.extent(1));l++)
|
||||
for(IndexType m=0;m<static_cast<IndexType>(a.extent(2));m++)
|
||||
for(IndexType n=0;n<static_cast<IndexType>(a.extent(3));n++)
|
||||
for(IndexType o=0;o<static_cast<IndexType>(a.extent(4));o++)
|
||||
a(idx,l,m,n,o) = Rand::draw(gen,begin,end);
|
||||
}
|
||||
}
|
||||
|
@ -1905,12 +1905,12 @@ struct fill_random_functor_begin_end<ViewType,RandomPool,loops,6,IndexType>{
|
|||
typename RandomPool::generator_type gen = rand_pool.get_state();
|
||||
for(IndexType j=0;j<loops;j++) {
|
||||
const IndexType idx = i*loops+j;
|
||||
if(idx<static_cast<IndexType>(a.dimension_0())) {
|
||||
for(IndexType k=0;k<static_cast<IndexType>(a.dimension_1());k++)
|
||||
for(IndexType l=0;l<static_cast<IndexType>(a.dimension_2());l++)
|
||||
for(IndexType m=0;m<static_cast<IndexType>(a.dimension_3());m++)
|
||||
for(IndexType n=0;n<static_cast<IndexType>(a.dimension_4());n++)
|
||||
for(IndexType o=0;o<static_cast<IndexType>(a.dimension_5());o++)
|
||||
if(idx<static_cast<IndexType>(a.extent(0))) {
|
||||
for(IndexType k=0;k<static_cast<IndexType>(a.extent(1));k++)
|
||||
for(IndexType l=0;l<static_cast<IndexType>(a.extent(2));l++)
|
||||
for(IndexType m=0;m<static_cast<IndexType>(a.extent(3));m++)
|
||||
for(IndexType n=0;n<static_cast<IndexType>(a.extent(4));n++)
|
||||
for(IndexType o=0;o<static_cast<IndexType>(a.extent(5));o++)
|
||||
a(idx,k,l,m,n,o) = Rand::draw(gen,begin,end);
|
||||
}
|
||||
}
|
||||
|
@ -1937,13 +1937,13 @@ struct fill_random_functor_begin_end<ViewType,RandomPool,loops,7,IndexType>{
|
|||
typename RandomPool::generator_type gen = rand_pool.get_state();
|
||||
for(IndexType j=0;j<loops;j++) {
|
||||
const IndexType idx = i*loops+j;
|
||||
if(idx<static_cast<IndexType>(a.dimension_0())) {
|
||||
for(IndexType k=0;k<static_cast<IndexType>(a.dimension_1());k++)
|
||||
for(IndexType l=0;l<static_cast<IndexType>(a.dimension_2());l++)
|
||||
for(IndexType m=0;m<static_cast<IndexType>(a.dimension_3());m++)
|
||||
for(IndexType n=0;n<static_cast<IndexType>(a.dimension_4());n++)
|
||||
for(IndexType o=0;o<static_cast<IndexType>(a.dimension_5());o++)
|
||||
for(IndexType p=0;p<static_cast<IndexType>(a.dimension_6());p++)
|
||||
if(idx<static_cast<IndexType>(a.extent(0))) {
|
||||
for(IndexType k=0;k<static_cast<IndexType>(a.extent(1));k++)
|
||||
for(IndexType l=0;l<static_cast<IndexType>(a.extent(2));l++)
|
||||
for(IndexType m=0;m<static_cast<IndexType>(a.extent(3));m++)
|
||||
for(IndexType n=0;n<static_cast<IndexType>(a.extent(4));n++)
|
||||
for(IndexType o=0;o<static_cast<IndexType>(a.extent(5));o++)
|
||||
for(IndexType p=0;p<static_cast<IndexType>(a.extent(6));p++)
|
||||
a(idx,k,l,m,n,o,p) = Rand::draw(gen,begin,end);
|
||||
}
|
||||
}
|
||||
|
@ -1969,14 +1969,14 @@ struct fill_random_functor_begin_end<ViewType,RandomPool,loops,8,IndexType>{
|
|||
typename RandomPool::generator_type gen = rand_pool.get_state();
|
||||
for(IndexType j=0;j<loops;j++) {
|
||||
const IndexType idx = i*loops+j;
|
||||
if(idx<static_cast<IndexType>(a.dimension_0())) {
|
||||
for(IndexType k=0;k<static_cast<IndexType>(a.dimension_1());k++)
|
||||
for(IndexType l=0;l<static_cast<IndexType>(a.dimension_2());l++)
|
||||
for(IndexType m=0;m<static_cast<IndexType>(a.dimension_3());m++)
|
||||
for(IndexType n=0;n<static_cast<IndexType>(a.dimension_4());n++)
|
||||
for(IndexType o=0;o<static_cast<IndexType>(a.dimension_5());o++)
|
||||
for(IndexType p=0;p<static_cast<IndexType>(a.dimension_6());p++)
|
||||
for(IndexType q=0;q<static_cast<IndexType>(a.dimension_7());q++)
|
||||
if(idx<static_cast<IndexType>(a.extent(0))) {
|
||||
for(IndexType k=0;k<static_cast<IndexType>(a.extent(1));k++)
|
||||
for(IndexType l=0;l<static_cast<IndexType>(a.extent(2));l++)
|
||||
for(IndexType m=0;m<static_cast<IndexType>(a.extent(3));m++)
|
||||
for(IndexType n=0;n<static_cast<IndexType>(a.extent(4));n++)
|
||||
for(IndexType o=0;o<static_cast<IndexType>(a.extent(5));o++)
|
||||
for(IndexType p=0;p<static_cast<IndexType>(a.extent(6));p++)
|
||||
for(IndexType q=0;q<static_cast<IndexType>(a.extent(7));q++)
|
||||
a(idx,k,l,m,n,o,p,q) = Rand::draw(gen,begin,end);
|
||||
}
|
||||
}
|
||||
|
@ -1988,14 +1988,14 @@ struct fill_random_functor_begin_end<ViewType,RandomPool,loops,8,IndexType>{
|
|||
|
||||
template<class ViewType, class RandomPool, class IndexType = int64_t>
|
||||
void fill_random(ViewType a, RandomPool g, typename ViewType::const_value_type range) {
|
||||
int64_t LDA = a.dimension_0();
|
||||
int64_t LDA = a.extent(0);
|
||||
if(LDA>0)
|
||||
parallel_for((LDA+127)/128,Impl::fill_random_functor_range<ViewType,RandomPool,128,ViewType::Rank,IndexType>(a,g,range));
|
||||
}
|
||||
|
||||
template<class ViewType, class RandomPool, class IndexType = int64_t>
|
||||
void fill_random(ViewType a, RandomPool g, typename ViewType::const_value_type begin,typename ViewType::const_value_type end ) {
|
||||
int64_t LDA = a.dimension_0();
|
||||
int64_t LDA = a.extent(0);
|
||||
if(LDA>0)
|
||||
parallel_for((LDA+127)/128,Impl::fill_random_functor_begin_end<ViewType,RandomPool,128,ViewType::Rank,IndexType>(a,g,begin,end));
|
||||
}
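The two `fill_random` overloads above launch `(LDA+127)/128` work items, and each functor loops over 128 consecutive indices with an `idx < a.extent(0)` guard. The following is a minimal serial sketch of that chunking pattern in plain C++, without Kokkos. The name `fill_chunked` and its parameters are hypothetical and exist only for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Serial stand-in for the chunked launch: each "work item" i handles up to
// `loops` consecutive indices and skips any index past the end of the data.
// fill_chunked is a hypothetical name, not part of the Kokkos API.
template <int loops>
int64_t fill_chunked(std::vector<int>& a, int value) {
  const int64_t n = static_cast<int64_t>(a.size());
  // Ceiling division, the same arithmetic as (LDA+127)/128 for loops == 128.
  const int64_t work_items = (n + loops - 1) / loops;
  for (int64_t i = 0; i < work_items; ++i) {  // the parallel_for range in the diff
    for (int64_t j = 0; j < loops; ++j) {
      const int64_t idx = i * loops + j;
      // Guard for the last, possibly partial, chunk.
      if (idx < n) a[static_cast<std::size_t>(idx)] = value;
    }
  }
  return work_items;
}
```

With 300 elements and `loops == 128`, this produces three work items, and the guard keeps the third one from writing past element 299.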
|
||||
|
|
|
@ -35,7 +35,7 @@
|
|||
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
//
|
||||
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
|
||||
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
|
||||
//
|
||||
// ************************************************************************
|
||||
//@HEADER
|
||||
|
@ -120,7 +120,6 @@ public:
|
|||
|
||||
KOKKOS_INLINE_FUNCTION
|
||||
void operator() (const int& i) const {
|
||||
// printf("copy: dst(%i) src(%i)\n",i+dst_offset,i);
|
||||
copy_op::copy(dst_values,i+dst_offset,src_values,i);
|
||||
}
|
||||
};
|
||||
|
@ -151,20 +150,22 @@ public:
|
|||
DstViewType dst_values ;
|
||||
perm_view_type sort_order ;
|
||||
src_view_type src_values ;
|
||||
int src_offset ;
|
||||
|
||||
copy_permute_functor( DstViewType const & dst_values_
|
||||
, PermuteViewType const & sort_order_
|
||||
, SrcViewType const & src_values_
|
||||
, int const & src_offset_
|
||||
)
|
||||
: dst_values( dst_values_ )
|
||||
, sort_order( sort_order_ )
|
||||
, src_values( src_values_ )
|
||||
, src_offset( src_offset_ )
|
||||
{}
|
||||
|
||||
KOKKOS_INLINE_FUNCTION
|
||||
void operator() (const int& i) const {
|
||||
// printf("copy_permute: dst(%i) src(%i)\n",i,sort_order(i));
|
||||
copy_op::copy(dst_values,i,src_values,sort_order(i));
|
||||
copy_op::copy(dst_values,i,src_values,src_offset+sort_order(i));
|
||||
}
|
||||
};
|
||||
|
||||
|
@ -259,19 +260,21 @@ public:
|
|||
// Create the permutation vector, the bin_offset array and the bin_count array. Can be called again if keys changed
|
||||
void create_permute_vector() {
|
||||
const size_t len = range_end - range_begin ;
|
||||
Kokkos::parallel_for (Kokkos::RangePolicy<execution_space,bin_count_tag> (0,len),*this);
|
||||
Kokkos::parallel_scan(Kokkos::RangePolicy<execution_space,bin_offset_tag> (0,bin_op.max_bins()) ,*this);
|
||||
Kokkos::parallel_for ("Kokkos::Sort::BinCount",Kokkos::RangePolicy<execution_space,bin_count_tag> (0,len),*this);
|
||||
Kokkos::parallel_scan("Kokkos::Sort::BinOffset",Kokkos::RangePolicy<execution_space,bin_offset_tag> (0,bin_op.max_bins()) ,*this);
|
||||
|
||||
Kokkos::deep_copy(bin_count_atomic,0);
|
||||
Kokkos::parallel_for (Kokkos::RangePolicy<execution_space,bin_binning_tag> (0,len),*this);
|
||||
Kokkos::parallel_for ("Kokkos::Sort::BinBinning",Kokkos::RangePolicy<execution_space,bin_binning_tag> (0,len),*this);
|
||||
|
||||
if(sort_within_bins)
|
||||
Kokkos::parallel_for (Kokkos::RangePolicy<execution_space,bin_sort_bins_tag>(0,bin_op.max_bins()) ,*this);
|
||||
Kokkos::parallel_for ("Kokkos::Sort::BinSort",Kokkos::RangePolicy<execution_space,bin_sort_bins_tag>(0,bin_op.max_bins()) ,*this);
|
||||
}
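`create_permute_vector` runs three labeled passes (BinCount, BinOffset, BinBinning) plus an optional sort within bins. The sketch below shows the first three passes serially in plain C++. It assumes that the permutation stores key-space indices, which is inferred from how `src_offset+sort_order(i)` is used elsewhere in this diff. `make_permute_vector` and `bin_of` are hypothetical names, and the per-bin sort is omitted.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical serial sketch of the three passes. bin_of maps a key to a bin
// in [0, max_bins). The result lists key indices grouped by ascending bin;
// keys inside one bin keep their original relative order.
std::vector<int> make_permute_vector(const std::vector<int>& keys,
                                     int range_begin, int range_end,
                                     int max_bins, int (*bin_of)(int)) {
  const int len = range_end - range_begin;
  std::vector<int> bin_count(max_bins, 0);
  std::vector<int> bin_offset(max_bins, 0);
  // Pass 1 (BinCount): histogram of keys per bin.
  for (int i = 0; i < len; ++i) ++bin_count[bin_of(keys[range_begin + i])];
  // Pass 2 (BinOffset): exclusive prefix sum gives the start of each bin.
  for (int b = 1; b < max_bins; ++b)
    bin_offset[b] = bin_offset[b - 1] + bin_count[b - 1];
  // Pass 3 (BinBinning): start from zeroed counters, then drop each index
  // into the next free slot of its bin.
  std::vector<int> filled(max_bins, 0);
  std::vector<int> sort_order(static_cast<std::size_t>(len));
  for (int i = 0; i < len; ++i) {
    const int j = range_begin + i;  // key-space index (assumption, see above)
    const int b = bin_of(keys[j]);
    sort_order[bin_offset[b] + filled[b]++] = j;
  }
  return sort_order;
}
```

In the parallel version the counters are atomic, so the order within a bin is not deterministic. That is one reason a separate BinSort pass exists when `sort_within_bins` is set.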
|
||||
|
||||
// Sort a view with respect ot the first dimension using the permutation array
|
||||
// Sort a subset of a view with respect to the first dimension using the permutation array
|
||||
template<class ValuesViewType>
|
||||
void sort( ValuesViewType const & values)
|
||||
void sort( ValuesViewType const & values
|
||||
, int values_range_begin
|
||||
, int values_range_end) const
|
||||
{
|
||||
typedef
|
||||
Kokkos::View< typename ValuesViewType::data_type,
|
||||
|
@ -280,6 +283,10 @@ public:
|
|||
scratch_view_type ;
|
||||
|
||||
const size_t len = range_end - range_begin ;
|
||||
const size_t values_len = values_range_end - values_range_begin ;
|
||||
if (len != values_len) {
|
||||
Kokkos::abort("BinSort::sort: values range length != permutation vector length");
|
||||
}
|
||||
|
||||
scratch_view_type
|
||||
sorted_values("Scratch",
|
||||
|
@ -297,19 +304,25 @@ public:
|
|||
, offset_type /* PermuteViewType */
|
||||
, ValuesViewType /* SrcViewType */
|
||||
>
|
||||
functor( sorted_values , sort_order , values );
|
||||
functor( sorted_values , sort_order , values, values_range_begin - range_begin );
|
||||
|
||||
parallel_for( Kokkos::RangePolicy<execution_space>(0,len),functor);
|
||||
parallel_for("Kokkos::Sort::CopyPermute", Kokkos::RangePolicy<execution_space>(0,len),functor);
|
||||
}
|
||||
|
||||
{
|
||||
copy_functor< ValuesViewType , scratch_view_type >
|
||||
functor( values , range_begin , sorted_values );
|
||||
|
||||
parallel_for( Kokkos::RangePolicy<execution_space>(0,len),functor);
|
||||
parallel_for("Kokkos::Sort::Copy", Kokkos::RangePolicy<execution_space>(0,len),functor);
|
||||
}
|
||||
}
|
||||
|
||||
template<class ValuesViewType>
|
||||
void sort( ValuesViewType const & values ) const
|
||||
{
|
||||
this->sort( values, 0, /*values.extent(0)*/ range_end - range_begin );
|
||||
}
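The new `sort` overload applies the permutation to a subrange of `values`. It checks that the range length matches, gathers through `src_offset + sort_order(i)` into scratch storage, and copies the result back. Below is a rough serial sketch of that flow in plain C++. `apply_permutation` is a hypothetical name, and an exception stands in for `Kokkos::abort`. The sketch writes back at `values_range_begin`, whereas the diff passes `range_begin` to the copy functor. The two coincide for the `bin_sort.sort(view,begin,end)` call shown later in this diff.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical serial sketch. sort_order holds key-space indices covering
// [range_begin, range_begin + sort_order.size()).
void apply_permutation(std::vector<double>& values,
                       const std::vector<int>& sort_order,
                       int range_begin,
                       int values_range_begin, int values_range_end) {
  const std::size_t len = sort_order.size();
  const std::size_t values_len =
      static_cast<std::size_t>(values_range_end - values_range_begin);
  if (len != values_len)
    throw std::invalid_argument(
        "values range length != permutation vector length");
  // Shift from key-space indices to positions inside `values`.
  const int src_offset = values_range_begin - range_begin;
  std::vector<double> scratch(len);
  // CopyPermute: gather into scratch in sorted order.
  for (std::size_t i = 0; i < len; ++i)
    scratch[i] = values[static_cast<std::size_t>(src_offset + sort_order[i])];
  // Copy: write the sorted run back over the same subrange.
  for (std::size_t i = 0; i < len; ++i)
    values[static_cast<std::size_t>(values_range_begin) + i] = scratch[i];
}
```

Elements outside `[values_range_begin, values_range_end)` are left untouched. This matters for the use case from issue \#1160, where only part of a view is sorted.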
|
||||
|
||||
// Get the permutation vector
|
||||
KOKKOS_INLINE_FUNCTION
|
||||
offset_type get_permute_vector() const { return sort_order;}
|
||||
|
@ -327,7 +340,7 @@ public:
|
|||
KOKKOS_INLINE_FUNCTION
|
||||
void operator() (const bin_count_tag& tag, const int& i) const {
|
||||
const int j = range_begin + i ;
|
||||
bin_count_atomic(bin_op.bin(keys,j))++;
|
||||
bin_count_atomic(bin_op.bin(keys, j))++;
|
||||
}
|
||||
|
||||
KOKKOS_INLINE_FUNCTION
|
||||
|
@ -512,7 +525,7 @@ void sort( ViewType const & view , bool const always_use_kokkos_sort = false)
|
|||
|
||||
Kokkos::Experimental::MinMaxScalar<typename ViewType::non_const_value_type> result;
|
||||
Kokkos::Experimental::MinMax<typename ViewType::non_const_value_type> reducer(result);
|
||||
parallel_reduce(Kokkos::RangePolicy<typename ViewType::execution_space>(0,view.extent(0)),
|
||||
parallel_reduce("Kokkos::Sort::FindExtent",Kokkos::RangePolicy<typename ViewType::execution_space>(0,view.extent(0)),
|
||||
Impl::min_max_functor<ViewType>(view),reducer);
|
||||
if(result.min_val == result.max_val) return;
|
||||
BinSort<ViewType, CompType> bin_sort(view,CompType(view.extent(0)/2,result.min_val,result.max_val),true);
|
||||
|
@ -532,7 +545,7 @@ void sort( ViewType view
|
|||
Kokkos::Experimental::MinMaxScalar<typename ViewType::non_const_value_type> result;
|
||||
Kokkos::Experimental::MinMax<typename ViewType::non_const_value_type> reducer(result);
|
||||
|
||||
parallel_reduce( range_policy( begin , end )
|
||||
parallel_reduce("Kokkos::Sort::FindExtent", range_policy( begin , end )
|
||||
, Impl::min_max_functor<ViewType>(view),reducer );
|
||||
|
||||
if(result.min_val == result.max_val) return;
|
||||
|
@ -541,8 +554,9 @@ void sort( ViewType view
|
|||
bin_sort(view,begin,end,CompType((end-begin)/2,result.min_val,result.max_val),true);
|
||||
|
||||
bin_sort.create_permute_vector();
|
||||
bin_sort.sort(view);
|
||||
bin_sort.sort(view,begin,end);
|
||||
}
|
||||
|
||||
}
|
||||
|
||||
#endif
|
||||
|
|
|
@ -35,7 +35,7 @@
|
|||
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
//
|
||||
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
|
||||
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
|
||||
//
|
||||
// ************************************************************************
|
||||
//@HEADER
|
||||
|
@ -61,14 +61,9 @@ class cuda : public ::testing::Test {
|
|||
protected:
|
||||
static void SetUpTestCase()
|
||||
{
|
||||
std::cout << std::setprecision(5) << std::scientific;
|
||||
Kokkos::HostSpace::execution_space::initialize();
|
||||
Kokkos::Cuda::initialize( Kokkos::Cuda::SelectDevice(0) );
|
||||
}
|
||||
static void TearDownTestCase()
|
||||
{
|
||||
Kokkos::Cuda::finalize();
|
||||
Kokkos::HostSpace::execution_space::finalize();
|
||||
}
|
||||
};
|
||||
|
||||
|
|
|
@ -35,7 +35,7 @@
|
|||
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
//
|
||||
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
|
||||
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
|
||||
//
|
||||
// ************************************************************************
|
||||
//@HEADER
|
||||
|
@ -60,25 +60,10 @@ protected:
|
|||
static void SetUpTestCase()
|
||||
{
|
||||
std::cout << std::setprecision(5) << std::scientific;
|
||||
|
||||
int threads_count = 0;
|
||||
#pragma omp parallel
|
||||
{
|
||||
#pragma omp atomic
|
||||
++threads_count;
|
||||
}
|
||||
|
||||
if (threads_count > 3) {
|
||||
threads_count /= 2;
|
||||
}
|
||||
|
||||
Kokkos::OpenMP::initialize( threads_count );
|
||||
Kokkos::OpenMP::print_configuration( std::cout );
|
||||
}
|
||||
|
||||
static void TearDownTestCase()
|
||||
{
|
||||
Kokkos::OpenMP::finalize();
|
||||
}
|
||||
};
|
||||
|
||||
|
|
|
@ -35,7 +35,7 @@
|
|||
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
//
|
||||
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
|
||||
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
|
||||
//
|
||||
// ************************************************************************
|
||||
//@HEADER
|
||||
|
@ -62,13 +62,9 @@ protected:
|
|||
static void SetUpTestCase()
|
||||
{
|
||||
std::cout << std::setprecision(5) << std::scientific;
|
||||
Kokkos::HostSpace::execution_space::initialize();
|
||||
Kokkos::Experimental::ROCm::initialize( Kokkos::Experimental::ROCm::SelectDevice(0) );
|
||||
}
|
||||
static void TearDownTestCase()
|
||||
{
|
||||
Kokkos::Experimental::ROCm::finalize();
|
||||
Kokkos::HostSpace::execution_space::finalize();
|
||||
}
|
||||
};
|
||||
|
||||
|
|
|
@ -34,7 +34,7 @@
|
|||
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
//
|
||||
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
|
||||
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
|
||||
//
|
||||
// ************************************************************************
|
||||
//@HEADER
|
||||
|
|
|
@ -35,7 +35,7 @@
|
|||
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
//
|
||||
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
|
||||
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
|
||||
//
|
||||
// ************************************************************************
|
||||
//@HEADER
|
||||
|
@ -62,13 +62,10 @@ class serial : public ::testing::Test {
|
|||
protected:
|
||||
static void SetUpTestCase()
|
||||
{
|
||||
std::cout << std::setprecision (5) << std::scientific;
|
||||
Kokkos::Serial::initialize ();
|
||||
}
|
||||
|
||||
static void TearDownTestCase ()
|
||||
{
|
||||
Kokkos::Serial::finalize ();
|
||||
}
|
||||
};
|
||||
|
||||
|
|
|
@ -34,7 +34,7 @@
|
|||
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
//
|
||||
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
|
||||
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
|
||||
//
|
||||
// ************************************************************************
|
||||
//@HEADER
|
||||
|
@ -171,10 +171,10 @@ void test_3D_sort(unsigned int n) {
|
|||
double sum_after = 0.0;
|
||||
unsigned int sort_fails = 0;
|
||||
|
||||
Kokkos::parallel_reduce(keys.dimension_0(),sum3D<ExecutionSpace, KeyType>(keys),sum_before);
|
||||
Kokkos::parallel_reduce(keys.extent(0),sum3D<ExecutionSpace, KeyType>(keys),sum_before);
|
||||
|
||||
int bin_1d = 1;
|
||||
while( bin_1d*bin_1d*bin_1d*4< (int) keys.dimension_0() ) bin_1d*=2;
|
||||
while( bin_1d*bin_1d*bin_1d*4< (int) keys.extent(0) ) bin_1d*=2;
|
||||
int bin_max[3] = {bin_1d,bin_1d,bin_1d};
|
||||
typename KeyViewType::value_type min[3] = {0,0,0};
|
||||
typename KeyViewType::value_type max[3] = {100,100,100};
|
||||
|
@ -186,8 +186,8 @@ void test_3D_sort(unsigned int n) {
|
|||
Sorter.create_permute_vector();
|
||||
Sorter.template sort< KeyViewType >(keys);
|
||||
|
||||
Kokkos::parallel_reduce(keys.dimension_0(),sum3D<ExecutionSpace, KeyType>(keys),sum_after);
|
||||
Kokkos::parallel_reduce(keys.dimension_0()-1,bin3d_is_sorted_struct<ExecutionSpace, KeyType>(keys,bin_1d,min[0],max[0]),sort_fails);
|
||||
Kokkos::parallel_reduce(keys.extent(0),sum3D<ExecutionSpace, KeyType>(keys),sum_after);
|
||||
Kokkos::parallel_reduce(keys.extent(0)-1,bin3d_is_sorted_struct<ExecutionSpace, KeyType>(keys,bin_1d,min[0],max[0]),sort_fails);
|
||||
|
||||
double ratio = sum_before/sum_after;
|
||||
double epsilon = 1e-10;
|
||||
|
@ -205,24 +205,13 @@ void test_3D_sort(unsigned int n) {
|
|||
template<class ExecutionSpace, typename KeyType>
|
||||
void test_dynamic_view_sort(unsigned int n )
|
||||
{
|
||||
typedef typename ExecutionSpace::memory_space memory_space ;
|
||||
typedef Kokkos::Experimental::DynamicView<KeyType*,ExecutionSpace> KeyDynamicViewType;
|
||||
typedef Kokkos::View<KeyType*,ExecutionSpace> KeyViewType;
|
||||
|
||||
const size_t upper_bound = 2 * n ;
|
||||
const size_t min_chunk_size = 1024;
|
||||
|
||||
const size_t total_alloc_size = n * sizeof(KeyType) * 1.2 ;
|
||||
const size_t superblock_size = std::min(total_alloc_size, size_t(1000000));
|
||||
|
||||
typename KeyDynamicViewType::memory_pool
|
||||
pool( memory_space()
|
||||
, n * sizeof(KeyType) * 1.2
|
||||
, 500 /* min block size in bytes */
|
||||
, 30000 /* max block size in bytes */
|
||||
, superblock_size
|
||||
);
|
||||
|
||||
KeyDynamicViewType keys("Keys",pool,upper_bound);
|
||||
KeyDynamicViewType keys("Keys", min_chunk_size, upper_bound);
|
||||
|
||||
keys.resize_serial(n);
|
||||
|
||||
|
@ -230,13 +219,15 @@ void test_dynamic_view_sort(unsigned int n )
|
|||
|
||||
// Test sorting array with all numbers equal
|
||||
Kokkos::deep_copy(keys_view,KeyType(1));
|
||||
Kokkos::Experimental::deep_copy(keys,keys_view);
|
||||
Kokkos::deep_copy(keys,keys_view);
|
||||
Kokkos::sort(keys, 0 /* begin */ , n /* end */ );
|
||||
|
||||
Kokkos::Random_XorShift64_Pool<ExecutionSpace> g(1931);
|
||||
Kokkos::fill_random(keys_view,g,Kokkos::Random_XorShift64_Pool<ExecutionSpace>::generator_type::MAX_URAND);
|
||||
|
||||
Kokkos::Experimental::deep_copy(keys,keys_view);
|
||||
ExecutionSpace::fence();
|
||||
Kokkos::deep_copy(keys,keys_view);
|
||||
//ExecutionSpace::fence();
|
||||
|
||||
double sum_before = 0.0;
|
||||
double sum_after = 0.0;
|
||||
|
@ -246,7 +237,9 @@ void test_dynamic_view_sort(unsigned int n )
|
|||
|
||||
Kokkos::sort(keys, 0 /* begin */ , n /* end */ );
|
||||
|
||||
Kokkos::Experimental::deep_copy( keys_view , keys );
|
||||
ExecutionSpace::fence(); // Need this fence to prevent BusError with Cuda
|
||||
Kokkos::deep_copy( keys_view , keys );
|
||||
//ExecutionSpace::fence();
|
||||
|
||||
Kokkos::parallel_reduce(n,sum<ExecutionSpace, KeyType>(keys_view),sum_after);
|
||||
Kokkos::parallel_reduce(n-1,is_sorted_struct<ExecutionSpace, KeyType>(keys_view),sort_fails);
|
||||
|
@ -269,6 +262,74 @@ void test_dynamic_view_sort(unsigned int n )
|
|||
|
||||
//----------------------------------------------------------------------------
|
||||
|
||||
template<class ExecutionSpace>
|
||||
void test_issue_1160()
|
||||
{
|
||||
Kokkos::View<int*, ExecutionSpace> element_("element", 10);
|
||||
Kokkos::View<double*, ExecutionSpace> x_("x", 10);
|
||||
Kokkos::View<double*, ExecutionSpace> v_("y", 10);
|
||||
|
||||
auto h_element = Kokkos::create_mirror_view(element_);
|
||||
auto h_x = Kokkos::create_mirror_view(x_);
|
||||
auto h_v = Kokkos::create_mirror_view(v_);
|
||||
|
||||
h_element(0) = 9;
|
||||
h_element(1) = 8;
|
||||
h_element(2) = 7;
|
||||
h_element(3) = 6;
|
||||
h_element(4) = 5;
|
||||
h_element(5) = 4;
|
||||
h_element(6) = 3;
|
||||
h_element(7) = 2;
|
||||
h_element(8) = 1;
|
||||
h_element(9) = 0;
|
||||
|
||||
for (int i = 0; i < 10; ++i) {
|
||||
h_v.access(i, 0) = h_x.access(i, 0) = double(h_element(i));
|
||||
}
|
||||
Kokkos::deep_copy(element_, h_element);
|
||||
Kokkos::deep_copy(x_, h_x);
|
||||
Kokkos::deep_copy(v_, h_v);
|
||||
|
||||
typedef decltype(element_) KeyViewType;
|
||||
typedef Kokkos::BinOp1D< KeyViewType > BinOp;
|
||||
|
||||
int begin = 3;
|
||||
int end = 8;
|
||||
auto max = h_element(begin);
|
||||
auto min = h_element(end - 1);
|
||||
BinOp binner(end - begin, min, max);
|
||||
|
||||
Kokkos::BinSort<KeyViewType , BinOp > Sorter(element_,begin,end,binner,false);
|
||||
Sorter.create_permute_vector();
|
||||
Sorter.sort(element_,begin,end);
|
||||
|
||||
Sorter.sort(x_,begin,end);
|
||||
Sorter.sort(v_,begin,end);
|
||||
|
||||
Kokkos::deep_copy(h_element, element_);
|
||||
Kokkos::deep_copy(h_x, x_);
|
||||
Kokkos::deep_copy(h_v, v_);
|
||||
|
||||
ASSERT_EQ(h_element(0), 9);
|
||||
ASSERT_EQ(h_element(1), 8);
|
||||
ASSERT_EQ(h_element(2), 7);
|
||||
ASSERT_EQ(h_element(3), 2);
|
||||
ASSERT_EQ(h_element(4), 3);
|
||||
ASSERT_EQ(h_element(5), 4);
|
||||
ASSERT_EQ(h_element(6), 5);
|
||||
ASSERT_EQ(h_element(7), 6);
|
||||
ASSERT_EQ(h_element(8), 1);
|
||||
ASSERT_EQ(h_element(9), 0);
|
||||
|
||||
for (int i = 0; i < 10; ++i) {
|
||||
ASSERT_EQ(h_element(i), int(h_x.access(i, 0)));
|
||||
ASSERT_EQ(h_element(i), int(h_v.access(i, 0)));
|
||||
}
|
||||
}
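The values asserted in `test_issue_1160` can be cross-checked without Kokkos. The sketch below is only a reference, not the `Kokkos::BinSort` implementation: it applies `std::sort` to the same half-open index range `[begin, end)` on a host `std::vector`. The helper names `sort_subrange` and `issue_1160_reference` are hypothetical and were introduced for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper (not part of Kokkos): sort only the half-open index
// range [begin, end) in ascending order and leave all other elements alone.
std::vector<int> sort_subrange(std::vector<int> keys, int begin, int end) {
  assert(0 <= begin && begin <= end &&
         end <= static_cast<int>(keys.size()));
  std::sort(keys.begin() + begin, keys.begin() + end);
  return keys;
}

// Same input as the test: keys 9, 8, ..., 0 with begin = 3 and end = 8.
std::vector<int> issue_1160_reference() {
  const std::vector<int> keys = {9, 8, 7, 6, 5, 4, 3, 2, 1, 0};
  return sort_subrange(keys, 3, 8);
}
```

Only indices 3 through 7 are reordered, so the result is 9, 8, 7, 2, 3, 4, 5, 6, 1, 0, which matches the `ASSERT_EQ` checks on `h_element` in the test.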
|
||||
|
||||
//----------------------------------------------------------------------------
|
||||
|
||||
template<class ExecutionSpace, typename KeyType>
|
||||
void test_sort(unsigned int N)
|
||||
{
|
||||
|
@ -278,6 +339,7 @@ void test_sort(unsigned int N)
|
|||
test_3D_sort<ExecutionSpace,KeyType>(N);
|
||||
test_dynamic_view_sort<ExecutionSpace,KeyType>(N*N);
|
||||
#endif
|
||||
test_issue_1160<ExecutionSpace>();
|
||||
}
|
||||
|
||||
}
|
||||
|
|
|
@ -35,7 +35,7 @@
|
|||
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
//
|
||||
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
|
||||
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
|
||||
//
|
||||
// ************************************************************************
|
||||
//@HEADER
|
||||
|
@ -63,25 +63,10 @@ protected:
|
|||
static void SetUpTestCase()
|
||||
{
|
||||
std::cout << std::setprecision(5) << std::scientific;
|
||||
|
||||
unsigned num_threads = 4;
|
||||
|
||||
if (Kokkos::hwloc::available()) {
|
||||
num_threads = Kokkos::hwloc::get_available_numa_count()
|
||||
* Kokkos::hwloc::get_available_cores_per_numa()
|
||||
// * Kokkos::hwloc::get_available_threads_per_core()
|
||||
;
|
||||
|
||||
}
|
||||
|
||||
std::cout << "Threads: " << num_threads << std::endl;
|
||||
|
||||
Kokkos::Threads::initialize( num_threads );
|
||||
}
|
||||
|
||||
static void TearDownTestCase()
|
||||
{
|
||||
Kokkos::Threads::finalize();
|
||||
}
|
||||
};
|
||||
|
||||
|
|
|
@ -35,16 +35,20 @@
|
|||
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
//
|
||||
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
|
||||
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
|
||||
//
|
||||
// ************************************************************************
|
||||
//@HEADER
|
||||
*/
|
||||
|
||||
#include <gtest/gtest.h>
|
||||
#include <Kokkos_Core.hpp>
|
||||
|
||||
int main(int argc, char *argv[]) {
|
||||
Kokkos::initialize(argc,argv);
|
||||
::testing::InitGoogleTest(&argc,argv);
|
||||
return RUN_ALL_TESTS();
|
||||
int result = RUN_ALL_TESTS();
|
||||
Kokkos::finalize();
|
||||
return result;
|
||||
}
|
||||
|
||||
|
|
|
@ -10,7 +10,7 @@ default: build
|
|||
|
||||
|
||||
ifneq (,$(findstring Cuda,$(KOKKOS_DEVICES)))
|
||||
CXX = ${KOKKOS_PATH}/config/nvcc_wrapper
|
||||
CXX = ${KOKKOS_PATH}/bin/nvcc_wrapper
|
||||
EXE = ${EXE_NAME}.cuda
|
||||
KOKKOS_CUDA_OPTIONS = "enable_lambda"
|
||||
else
|
||||
|
|
|
@ -3,7 +3,7 @@
|
|||
# BytesAndFlops
|
||||
cd build/bytes_and_flops
|
||||
|
||||
USE_CUDA=`grep "_CUDA 1" KokkosCore_config.h | wc -l`
|
||||
USE_CUDA=`grep "_CUDA" KokkosCore_config.h | wc -l`
|
||||
|
||||
if [[ ${USE_CUDA} > 0 ]]; then
|
||||
BAF_EXE=bytes_and_flops.cuda
|
||||
|
@ -41,4 +41,4 @@ cd ../..
|
|||
echo "MiniFE: ${FE_PERF_1} ${FE_PERF_2}"
|
||||
|
||||
PERF_RESULT=`echo "${BAF_PERF_1} ${BAF_PERF_2} ${MD_PERF_1} ${MD_PERF_2} ${FE_PERF_1} ${FE_PERF_2}" | awk '{print ($1+$2+$3+$4+$5+$6)/6}'`
|
||||
echo "Total Result: " ${PERF_RESULT}
|
||||
echo "Total Result: " ${PERF_RESULT}
|
||||
|
|
|
@ -35,7 +35,7 @@
|
|||
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
//
|
||||
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
|
||||
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
|
||||
//
|
||||
// ************************************************************************
|
||||
//@HEADER
|
||||
|
|
|
@ -35,7 +35,7 @@
|
|||
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
//
|
||||
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
|
||||
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
|
||||
//
|
||||
// ************************************************************************
|
||||
//@HEADER
|
||||
|
|
|
@ -35,7 +35,7 @@
|
|||
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
//
|
||||
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
|
||||
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
|
||||
//
|
||||
// ************************************************************************
|
||||
//@HEADER
|
||||
|
|
|
@ -35,7 +35,7 @@
|
|||
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
//
|
||||
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
|
||||
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
|
||||
//
|
||||
// ************************************************************************
|
||||
//@HEADER
|
||||
|
|
|
@ -35,7 +35,7 @@
|
|||
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
//
|
||||
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
|
||||
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
|
||||
//
|
||||
// ************************************************************************
|
||||
//@HEADER
|
||||
|
|
|
@ -35,7 +35,7 @@
|
|||
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
//
|
||||
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
|
||||
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
|
||||
//
|
||||
// ************************************************************************
|
||||
//@HEADER
|
||||
|
|
|
@ -35,7 +35,7 @@
|
|||
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
//
|
||||
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
|
||||
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
|
||||
//
|
||||
// ************************************************************************
|
||||
//@HEADER
|
||||
|
|
|
@ -35,7 +35,7 @@
|
|||
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
//
|
||||
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
|
||||
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
|
||||
//
|
||||
// ************************************************************************
|
||||
//@HEADER
|
||||
|
|
|
@ -35,7 +35,7 @@
|
|||
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
//
|
||||
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
|
||||
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
|
||||
//
|
||||
// ************************************************************************
|
||||
//@HEADER
|
||||
|
|
|
@ -2,7 +2,7 @@
|
|||
# FindHWLOC
|
||||
# ----------
|
||||
#
|
||||
# Try to find HWLOC.
|
||||
# Try to find HWLOC, based on KOKKOS_HWLOC_DIR
|
||||
#
|
||||
# The following variables are defined:
|
||||
#
|
||||
|
@ -10,8 +10,8 @@
|
|||
# HWLOC_INCLUDE_DIR - HWLOC include directory
|
||||
# HWLOC_LIBRARIES - Libraries needed to use HWLOC
|
||||
|
||||
find_path(HWLOC_INCLUDE_DIR hwloc.h)
|
||||
find_library(HWLOC_LIBRARIES hwloc)
|
||||
find_path(HWLOC_INCLUDE_DIR hwloc.h PATHS "${KOKKOS_HWLOC_DIR}/include")
|
||||
find_library(HWLOC_LIBRARIES hwloc PATHS "${KOKKOS_HWLOC_DIR}/lib")
|
||||
|
||||
include(FindPackageHandleStandardArgs)
|
||||
find_package_handle_standard_args(HWLOC DEFAULT_MSG
|
||||
|
|
|
@ -1,7 +1,3 @@
|
|||
# kokkos_generated_settings.cmake includes the kokkos library itself in KOKKOS_LIBS
|
||||
# which we do not want to use for the cmake builds so clean this up
|
||||
string(REGEX REPLACE "-lkokkos" "" KOKKOS_LIBS ${KOKKOS_LIBS})
|
||||
|
||||
############################ Detect if submodule ###############################
|
||||
#
|
||||
# With thanks to StackOverflow:
|
||||
|
@ -73,6 +69,19 @@ IF(KOKKOS_SEPARATE_LIBS)
|
|||
PUBLIC $<$<COMPILE_LANGUAGE:CXX>:${KOKKOS_CXX_FLAGS}>
|
||||
)
|
||||
|
||||
target_include_directories(
|
||||
kokkoscore
|
||||
PUBLIC
|
||||
${KOKKOS_TPL_INCLUDE_DIRS}
|
||||
)
|
||||
|
||||
foreach(lib IN LISTS KOKKOS_TPL_LIBRARY_NAMES)
|
||||
find_library(LIB_${lib} ${lib} PATHS ${KOKKOS_TPL_LIBRARY_DIRS})
|
||||
target_link_libraries(kokkoscore PUBLIC ${LIB_${lib}})
|
||||
endforeach()
|
||||
|
||||
target_link_libraries(kokkoscore PUBLIC "${KOKKOS_LINK_FLAGS}")
|
||||
|
||||
# Install the kokkoscore library
|
||||
INSTALL (TARGETS kokkoscore
|
||||
EXPORT KokkosTargets
|
||||
|
@ -81,12 +90,6 @@ IF(KOKKOS_SEPARATE_LIBS)
|
|||
RUNTIME DESTINATION ${CMAKE_INSTALL_PREFIX}/bin
|
||||
)
|
||||
|
||||
TARGET_LINK_LIBRARIES(
|
||||
kokkoscore
|
||||
${KOKKOS_LD_FLAGS}
|
||||
${KOKKOS_EXTRA_LIBS_LIST}
|
||||
)
|
||||
|
||||
# kokkoscontainers
|
||||
if (DEFINED KOKKOS_CONTAINERS_SRCS)
|
||||
ADD_LIBRARY(
|
||||
|
@ -144,12 +147,19 @@ ELSE()
|
|||
PUBLIC $<$<COMPILE_LANGUAGE:CXX>:${KOKKOS_CXX_FLAGS}>
|
||||
)
|
||||
|
||||
TARGET_LINK_LIBRARIES(
|
||||
target_include_directories(
|
||||
kokkos
|
||||
${KOKKOS_LD_FLAGS}
|
||||
${KOKKOS_EXTRA_LIBS_LIST}
|
||||
PUBLIC
|
||||
${KOKKOS_TPL_INCLUDE_DIRS}
|
||||
)
|
||||
|
||||
foreach(lib IN LISTS KOKKOS_TPL_LIBRARY_NAMES)
|
||||
find_library(LIB_${lib} ${lib} PATHS ${KOKKOS_TPL_LIBRARY_DIRS})
|
||||
target_link_libraries(kokkos PUBLIC ${LIB_${lib}})
|
||||
endforeach()
|
||||
|
||||
target_link_libraries(kokkos PUBLIC "${KOKKOS_LINK_FLAGS}")
|
||||
|
||||
# Install the kokkos library
|
||||
INSTALL (TARGETS kokkos
|
||||
EXPORT KokkosTargets
|
||||
|
|
|
@ -25,11 +25,12 @@ list(APPEND KOKKOS_INTERNAL_ENABLE_OPTIONS_LIST
|
|||
Cuda_LDG_Intrinsic
|
||||
Debug
|
||||
Debug_DualView_Modify_Check
|
||||
Debug_Bounds_Checkt
|
||||
Debug_Bounds_Check
|
||||
Compiler_Warnings
|
||||
Profiling
|
||||
Profiling_Load_Print
|
||||
Aggressive_Vectorization
|
||||
Deprecated_Code
|
||||
)
|
||||
|
||||
#-------------------------------------------------------------------------------
|
||||
|
@ -263,7 +264,8 @@ set(KOKKOS_ENABLE_PROFILING ${KOKKOS_INTERNAL_ENABLE_PROFILING_DEFAULT} CACHE BO
|
|||
set_kokkos_default_default(PROFILING_LOAD_PRINT OFF)
|
||||
set(KOKKOS_ENABLE_PROFILING_LOAD_PRINT ${KOKKOS_INTERNAL_ENABLE_PROFILING_LOAD_PRINT_DEFAULT} CACHE BOOL "Enable profile load print.")
|
||||
|
||||
|
||||
set_kokkos_default_default(DEPRECATED_CODE ON)
|
||||
set(KOKKOS_ENABLE_DEPRECATED_CODE ${KOKKOS_INTERNAL_ENABLE_DEPRECATED_CODE_DEFAULT} CACHE BOOL "Enable deprecated code.")
|
||||
|
||||
|
||||
#-------------------------------------------------------------------------------
|
||||
|
|
|
@ -14,6 +14,13 @@
|
|||
#-------------------------------------------------------------------------------
|
||||
|
||||
# Ensure that KOKKOS_ARCH is in the ARCH_LIST
|
||||
if (KOKKOS_ARCH MATCHES ",")
|
||||
message("-- Detected a comma in: KOKKOS_ARCH=${KOKKOS_ARCH}")
|
||||
message("-- Although we prefer KOKKOS_ARCH to be semicolon-delimited, we do allow")
|
||||
message("-- comma-delimited values for compatibility with scripts (see github.com/trilinos/Trilinos/issues/2330)")
|
||||
string(REPLACE "," ";" KOKKOS_ARCH "${KOKKOS_ARCH}")
|
||||
message("-- Commas were changed to semicolons, now KOKKOS_ARCH=${KOKKOS_ARCH}")
|
||||
endif()
|
||||
foreach(arch ${KOKKOS_ARCH})
|
||||
list(FIND KOKKOS_ARCH_LIST ${arch} indx)
|
||||
if (indx EQUAL -1)
|
||||
|
@ -23,14 +30,13 @@ foreach(arch ${KOKKOS_ARCH})
|
|||
endforeach()
|
||||
|
||||
# KOKKOS_SETTINGS uses KOKKOS_ARCH
|
||||
string(REPLACE ";" "," KOKKOS_ARCH "${KOKKOS_ARCH}")
|
||||
set(KOKKOS_ARCH ${KOKKOS_ARCH})
|
||||
string(REPLACE ";" "," KOKKOS_GMAKE_ARCH "${KOKKOS_ARCH}")
|
||||
|
||||
# From Makefile.kokkos: Options: yes,no
|
||||
if(${KOKKOS_ENABLE_DEBUG})
|
||||
set(KOKKOS_DEBUG yes)
|
||||
set(KOKKOS_GMAKE_DEBUG yes)
|
||||
else()
|
||||
set(KOKKOS_DEBUG no)
|
||||
set(KOKKOS_GMAKE_DEBUG no)
|
||||
endif()
|
||||
|
||||
#------------------------------- KOKKOS_DEVICES --------------------------------
|
||||
|
@ -43,10 +49,10 @@ foreach(devopt ${KOKKOS_DEVICES_LIST})
|
|||
endif ()
|
||||
endforeach()
|
||||
# List needs to be comma-delimited
|
||||
string(REPLACE ";" "," KOKKOS_DEVICES "${KOKKOS_DEVICESl}")
|
||||
string(REPLACE ";" "," KOKKOS_GMAKE_DEVICES "${KOKKOS_DEVICESl}")
|
||||
|
||||
#------------------------------- KOKKOS_OPTIONS --------------------------------
|
||||
# From Makefile.kokkos: Options: aggressive_vectorization,disable_profiling
|
||||
# From Makefile.kokkos: Options: aggressive_vectorization,disable_profiling,disable_deprecated_code
|
||||
#compiler_warnings, aggressive_vectorization, disable_profiling, disable_dualview_modify_check, enable_profile_load_print
|
||||
|
||||
set(KOKKOS_OPTIONSl)
|
||||
|
@ -57,7 +63,10 @@ if(${KOKKOS_ENABLE_AGGRESSIVE_VECTORIZATION})
|
|||
list(APPEND KOKKOS_OPTIONSl aggressive_vectorization)
|
||||
endif()
|
||||
if(NOT ${KOKKOS_ENABLE_PROFILING})
|
||||
list(APPEND KOKKOS_OPTIONSl disable_vectorization)
|
||||
list(APPEND KOKKOS_OPTIONSl disable_profiling)
|
||||
endif()
|
||||
if(NOT ${KOKKOS_ENABLE_DEPRECATED_CODE})
|
||||
list(APPEND KOKKOS_OPTIONSl disable_deprecated_code)
|
||||
endif()
|
||||
if(NOT ${KOKKOS_ENABLE_DEBUG_DUALVIEW_MODIFY_CHECK})
|
||||
list(APPEND KOKKOS_OPTIONSl disable_dualview_modify_check)
|
||||
|
@ -66,7 +75,7 @@ if(${KOKKOS_ENABLE_PROFILING_LOAD_PRINT})
|
|||
list(APPEND KOKKOS_OPTIONSl enable_profile_load_print)
|
||||
endif()
|
||||
# List needs to be comma-delimited
|
||||
string(REPLACE ";" "," KOKKOS_OPTIONS "${KOKKOS_OPTIONSl}")
|
||||
string(REPLACE ";" "," KOKKOS_GMAKE_OPTIONS "${KOKKOS_OPTIONSl}")
|
||||
|
||||
|
||||
#------------------------------- KOKKOS_USE_TPLS -------------------------------
|
||||
|
@ -78,19 +87,19 @@ foreach(tplopt ${KOKKOS_USE_TPLS_LIST})
|
|||
endif ()
|
||||
endforeach()
|
||||
# List needs to be comma-delimited
|
||||
string(REPLACE ";" "," KOKKOS_USE_TPLS "${KOKKOS_USE_TPLSl}")
|
||||
string(REPLACE ";" "," KOKKOS_GMAKE_USE_TPLS "${KOKKOS_USE_TPLSl}")
|
||||
|
||||
|
||||
#------------------------------- KOKKOS_CUDA_OPTIONS ---------------------------
|
||||
# Construct the Makefile options
|
||||
set(KOKKOS_CUDA_OPTIONS)
|
||||
set(KOKKOS_CUDA_OPTIONSl)
|
||||
foreach(cudaopt ${KOKKOS_CUDA_OPTIONS_LIST})
|
||||
if (${KOKKOS_ENABLE_CUDA_${cudaopt}})
|
||||
list(APPEND KOKKOS_CUDA_OPTIONSl ${KOKKOS_INTERNAL_${cudaopt}})
|
||||
endif ()
|
||||
endforeach()
|
||||
# List needs to be comma-delimited
|
||||
string(REPLACE ";" "," KOKKOS_CUDA_OPTIONS "${KOKKOS_CUDA_OPTIONSl}")
|
||||
string(REPLACE ";" "," KOKKOS_GMAKE_CUDA_OPTIONS "${KOKKOS_CUDA_OPTIONSl}")
|
||||
|
||||
#------------------------------- PATH VARIABLES --------------------------------
|
||||
# Want makefile to use same executables specified which means modifying
|
||||
|
@ -100,10 +109,10 @@ string(REPLACE ";" "," KOKKOS_CUDA_OPTIONS "${KOKKOS_CUDA_OPTIONSl}")
|
|||
|
||||
set(KOKKOS_INTERNAL_PATHS)
|
||||
set(addpathl)
|
||||
foreach(kvar "CUDA;QTHREADS;${KOKKOS_USE_TPLS_LIST}")
|
||||
foreach(kvar IN LISTS KOKKOS_USE_TPLS_LIST ITEMS CUDA QTHREADS)
|
||||
if(${KOKKOS_ENABLE_${kvar}})
|
||||
if(DEFINED KOKKOS_${kvar}_DIR)
|
||||
set(KOKKOS_INTERNAL_PATHS "${KOKKOS_INTERNAL_PATHS} ${kvar}_PATH=${KOKKOS_${kvar}_DIR}")
|
||||
set(KOKKOS_INTERNAL_PATHS ${KOKKOS_INTERNAL_PATHS} "${kvar}_PATH=${KOKKOS_${kvar}_DIR}")
|
||||
if(IS_DIRECTORY ${KOKKOS_${kvar}_DIR}/bin)
|
||||
list(APPEND addpathl ${KOKKOS_${kvar}_DIR}/bin)
|
||||
endif()
|
||||
|
@ -124,10 +133,9 @@ set(KOKKOS_SETTINGS ${KOKKOS_SETTINGS} KOKKOS_INSTALL_PATH=${CMAKE_INSTALL_PREFI
|
|||
|
||||
# Form of KOKKOS_foo=$KOKKOS_foo
|
||||
foreach(kvar ARCH;DEVICES;DEBUG;OPTIONS;CUDA_OPTIONS;USE_TPLS)
|
||||
set(KOKKOS_VAR KOKKOS_${kvar})
|
||||
if(DEFINED KOKKOS_${kvar})
|
||||
if (NOT "${${KOKKOS_VAR}}" STREQUAL "")
|
||||
set(KOKKOS_SETTINGS ${KOKKOS_SETTINGS} ${KOKKOS_VAR}=${${KOKKOS_VAR}})
|
||||
if(DEFINED KOKKOS_GMAKE_${kvar})
|
||||
if (NOT "${KOKKOS_GMAKE_${kvar}}" STREQUAL "")
|
||||
set(KOKKOS_SETTINGS ${KOKKOS_SETTINGS} KOKKOS_${kvar}=${KOKKOS_GMAKE_${kvar}})
|
||||
endif()
|
||||
endif()
|
||||
endforeach()
|
||||
|
@ -147,7 +155,7 @@ if (NOT "${KOKKOS_INTERNAL_PATHS}" STREQUAL "")
|
|||
set(KOKKOS_SETTINGS ${KOKKOS_SETTINGS} ${KOKKOS_INTERNAL_PATHS})
|
||||
endif()
|
||||
if (NOT "${KOKKOS_INTERNAL_ADDTOPATH}" STREQUAL "")
|
||||
set(KOKKOS_SETTINGS ${KOKKOS_SETTINGS} PATH=${KOKKOS_INTERNAL_ADDTOPATH}:\${PATH})
|
||||
set(KOKKOS_SETTINGS ${KOKKOS_SETTINGS} "PATH=\"${KOKKOS_INTERNAL_ADDTOPATH}:$ENV{PATH}\"")
|
||||
endif()
|
||||
|
||||
# Final form that gets passed to make
|
||||
|
@ -185,7 +193,7 @@ if(KOKKOS_CMAKE_VERBOSE)
|
|||
|
||||
message(STATUS "")
|
||||
message(STATUS "Architectures:")
|
||||
message(STATUS " ${KOKKOS_ARCH}")
|
||||
message(STATUS " ${KOKKOS_GMAKE_ARCH}")
|
||||
|
||||
message(STATUS "")
|
||||
message(STATUS "Enabled options")
|
||||
|
@ -194,43 +202,14 @@ if(KOKKOS_CMAKE_VERBOSE)
|
|||
message(STATUS " KOKKOS_SEPARATE_LIBS")
|
||||
endif()
|
||||
|
||||
if(KOKKOS_ENABLE_HWLOC)
|
||||
message(STATUS " KOKKOS_ENABLE_HWLOC")
|
||||
endif()
|
||||
|
||||
if(KOKKOS_ENABLE_MEMKIND)
|
||||
message(STATUS " KOKKOS_ENABLE_MEMKIND")
|
||||
endif()
|
||||
|
||||
if(KOKKOS_ENABLE_DEBUG)
|
||||
message(STATUS " KOKKOS_ENABLE_DEBUG")
|
||||
endif()
|
||||
|
||||
if(KOKKOS_ENABLE_PROFILING)
|
||||
message(STATUS " KOKKOS_ENABLE_PROFILING")
|
||||
endif()
|
||||
|
||||
if(KOKKOS_ENABLE_AGGRESSIVE_VECTORIZATION)
|
||||
message(STATUS " KOKKOS_ENABLE_AGGRESSIVE_VECTORIZATION")
|
||||
endif()
|
||||
foreach(opt IN LISTS KOKKOS_INTERNAL_ENABLE_OPTIONS_LIST)
|
||||
string(TOUPPER ${opt} OPT)
|
||||
if (KOKKOS_ENABLE_${OPT})
|
||||
message(STATUS " KOKKOS_ENABLE_${OPT}")
|
||||
endif()
|
||||
endforeach()
|
||||
|
||||
if(KOKKOS_ENABLE_CUDA)
|
||||
if(KOKKOS_ENABLE_CUDA_LDG_INTRINSIC)
|
||||
message(STATUS " KOKKOS_ENABLE_CUDA_LDG_INTRINSIC")
|
||||
endif()
|
||||
|
||||
if(KOKKOS_ENABLE_CUDA_UVM)
|
||||
message(STATUS " KOKKOS_ENABLE_CUDA_UVM")
|
||||
endif()
|
||||
|
||||
if(KOKKOS_ENABLE_CUDA_RELOCATABLE_DEVICE_CODE)
|
||||
message(STATUS " KOKKOS_ENABLE_CUDA_RELOCATABLE_DEVICE_CODE")
|
||||
endif()
|
||||
|
||||
if(KOKKOS_ENABLE_CUDA_LAMBDA)
|
||||
message(STATUS " KOKKOS_ENABLE_CUDA_LAMBDA")
|
||||
endif()
|
||||
|
||||
if(KOKKOS_CUDA_DIR)
|
||||
message(STATUS " KOKKOS_CUDA_DIR: ${KOKKOS_CUDA_DIR}")
|
||||
endif()
|
||||
|
|
|
@ -3,7 +3,7 @@ INCLUDE(CTest)
|
|||
|
||||
cmake_policy(SET CMP0054 NEW)
|
||||
|
||||
MESSAGE(WARNING "The project name is: ${PROJECT_NAME}")
|
||||
MESSAGE(STATUS "The project name is: ${PROJECT_NAME}")
|
||||
|
||||
IF(NOT DEFINED ${PROJECT_NAME}_ENABLE_OpenMP)
|
||||
SET(${PROJECT_NAME}_ENABLE_OpenMP OFF)
|
||||
|
@ -84,9 +84,6 @@ ENDFUNCTION()
|
|||
|
||||
|
||||
MACRO(TRIBITS_ADD_TEST_DIRECTORIES)
|
||||
message(STATUS "ProjectName: " ${PROJECT_NAME})
|
||||
message(STATUS "Tests: " ${${PROJECT_NAME}_ENABLE_TESTS})
|
||||
|
||||
IF(${${PROJECT_NAME}_ENABLE_TESTS})
|
||||
FOREACH(TEST_DIR ${ARGN})
|
||||
ADD_SUBDIRECTORY(${TEST_DIR})
|
||||
|
@ -95,13 +92,11 @@ MACRO(TRIBITS_ADD_TEST_DIRECTORIES)
|
|||
ENDMACRO()
|
||||
|
||||
MACRO(TRIBITS_ADD_EXAMPLE_DIRECTORIES)
|
||||
|
||||
IF(${PACKAGE_NAME}_ENABLE_EXAMPLES OR ${PARENT_PACKAGE_NAME}_ENABLE_EXAMPLES)
|
||||
FOREACH(EXAMPLE_DIR ${ARGN})
|
||||
ADD_SUBDIRECTORY(${EXAMPLE_DIR})
|
||||
ENDFOREACH()
|
||||
ENDIF()
|
||||
|
||||
ENDMACRO()
|
||||
|
||||
|
||||
|
|
|
@ -1,190 +0,0 @@
|
|||
#!/bin/sh
|
||||
#
|
||||
# Copy this script, put it outside the Trilinos source directory, and
|
||||
# build there.
|
||||
#
|
||||
# Additional command-line arguments given to this script will be
|
||||
# passed directly to CMake.
|
||||
#
|
||||
|
||||
#
|
||||
# Force CMake to re-evaluate build options.
|
||||
#
|
||||
rm -rf CMake* Trilinos* packages Dart* Testing cmake_install.cmake MakeFile*
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
# Incrementally construct cmake configure options:
|
||||
|
||||
CMAKE_CONFIGURE=""
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
# Location of Trilinos source tree:
|
||||
|
||||
CMAKE_PROJECT_DIR="${HOME}/Trilinos"
|
||||
|
||||
# Location for installation:
|
||||
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_INSTALL_PREFIX=/home/projects/kokkos/host/`date +%F`"
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
# General build options.
|
||||
# Use a variable so options can be propagated to CUDA compiler.
|
||||
|
||||
CMAKE_VERBOSE_MAKEFILE=OFF
|
||||
CMAKE_BUILD_TYPE=RELEASE
|
||||
# CMAKE_BUILD_TYPE=DEBUG
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
# Build for CUDA architecture:
|
||||
|
||||
CUDA_ARCH=""
|
||||
# CUDA_ARCH="20"
|
||||
# CUDA_ARCH="30"
|
||||
# CUDA_ARCH="35"
|
||||
|
||||
# Build with Intel compiler
|
||||
|
||||
INTEL=ON
|
||||
|
||||
# Build for MIC architecture:
|
||||
|
||||
# INTEL_XEON_PHI=ON
|
||||
|
||||
# Build with HWLOC at location:
|
||||
|
||||
HWLOC_BASE_DIR="/home/projects/libraries/host/hwloc/1.6.2"
|
||||
|
||||
# Location for MPI to use in examples:
|
||||
|
||||
MPI_BASE_DIR=""
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
# MPI configuration only used for examples:
|
||||
#
|
||||
# Must have the MPI_BASE_DIR so that the
|
||||
# include path can be passed to the Cuda compiler
|
||||
|
||||
if [ -n "${MPI_BASE_DIR}" ] ;
|
||||
then
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_MPI:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D MPI_BASE_DIR:PATH=${MPI_BASE_DIR}"
|
||||
else
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_MPI:BOOL=OFF"
|
||||
fi
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
# Pthread configuration:
|
||||
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_Pthread:BOOL=ON"
|
||||
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_Pthread:BOOL=OFF"
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
# OpenMP configuration:
|
||||
|
||||
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_OpenMP:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_OpenMP:BOOL=OFF"
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
#-----------------------------------------------------------------------------
|
||||
# Configure packages for kokkos-only:
|
||||
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_ALL_PACKAGES:BOOL=OFF"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_EXAMPLES:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_TESTS:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosCore:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosContainers:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosExample:BOOL=ON"
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
#-----------------------------------------------------------------------------
|
||||
# Hardware locality cmake configuration:
|
||||
|
||||
if [ -n "${HWLOC_BASE_DIR}" ] ;
|
||||
then
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_HWLOC:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D HWLOC_INCLUDE_DIRS:FILEPATH=${HWLOC_BASE_DIR}/include"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D HWLOC_LIBRARY_DIRS:FILEPATH=${HWLOC_BASE_DIR}/lib"
|
||||
fi
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
# Cuda cmake configuration:
|
||||
|
||||
if [ -n "${CUDA_ARCH}" ] ;
|
||||
then
|
||||
|
||||
# Options to CUDA_NVCC_FLAGS must be semi-colon delimited,
|
||||
# this is different than the standard CMAKE_CXX_FLAGS syntax.
|
||||
|
||||
CUDA_NVCC_FLAGS="-gencode;arch=compute_${CUDA_ARCH},code=sm_${CUDA_ARCH}"
|
||||
CUDA_NVCC_FLAGS="${CUDA_NVCC_FLAGS};-Xcompiler;-Wall,-ansi"
|
||||
|
||||
if [ "${CMAKE_BUILD_TYPE}" = "DEBUG" ] ;
|
||||
then
|
||||
CUDA_NVCC_FLAGS="${CUDA_NVCC_FLAGS};-g"
|
||||
else
|
||||
CUDA_NVCC_FLAGS="${CUDA_NVCC_FLAGS};-O3"
|
||||
fi
|
||||
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_CUDA:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_CUSPARSE:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CUDA_VERBOSE_BUILD:BOOL=OFF"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CUDA_NVCC_FLAGS:STRING=${CUDA_NVCC_FLAGS}"
|
||||
|
||||
fi
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
|
||||
if [ "${INTEL}" = "ON" -o "${INTEL_XEON_PHI}" = "ON" ] ;
|
||||
then
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_C_COMPILER=icc"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_COMPILER=icpc"
|
||||
fi
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
|
||||
# Cross-compile for Intel Xeon Phi:
|
||||
|
||||
if [ "${INTEL_XEON_PHI}" = "ON" ] ;
|
||||
then
|
||||
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_SYSTEM_NAME=Linux"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_FLAGS:STRING=-mmic"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_C_FLAGS:STRING=-mmic"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_Fortran_COMPILER:FILEPATH=ifort"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D BLAS_LIBRARY_DIRS:FILEPATH=${MKLROOT}/lib/mic"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D BLAS_LIBRARY_NAMES='mkl_intel_lp64;mkl_sequential;mkl_core;pthread;m'"
|
||||
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_CHECKED_STL:BOOL=OFF"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_WARNINGS_AS_ERRORS_FLAGS:STRING=''"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D BUILD_SHARED_LIBS:BOOL=OFF"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D DART_TESTING_TIMEOUT:STRING=600"
|
||||
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D LAPACK_LIBRARY_NAMES=''"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_LAPACK_LIBRARIES=''"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_BinUtils=OFF"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_Pthread_LIBRARIES=pthread"
|
||||
|
||||
# Cannot cross-compile fortran compatibility checks on the MIC:
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_Fortran:BOOL=OFF"
|
||||
|
||||
# Tell cmake the answers to compile-and-execute tests
|
||||
# to prevent cmake from executing a cross-compiled program.
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D HAVE_GCC_ABI_DEMANGLE_EXITCODE=0"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D HAVE_TEUCHOS_BLASFLOAT_EXITCODE=0"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D LAPACK_SLAPY2_WORKS_EXITCODE=0"
|
||||
|
||||
fi
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_BUILD_TYPE:STRING=${CMAKE_BUILD_TYPE}"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_VERBOSE_MAKEFILE:BOOL=${CMAKE_VERBOSE_MAKEFILE}"
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
|
||||
echo "cmake ${CMAKE_CONFIGURE} ${CMAKE_PROJECT_DIR}"
|
||||
|
||||
cmake ${CMAKE_CONFIGURE} ${CMAKE_PROJECT_DIR}
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
|
|
@ -1,186 +0,0 @@
|
|||
#!/bin/sh
|
||||
#
|
||||
# Copy this script, put it outside the Trilinos source directory, and
|
||||
# build there.
|
||||
#
|
||||
# Additional command-line arguments given to this script will be
|
||||
# passed directly to CMake.
|
||||
#
|
||||
|
||||
#
|
||||
# Force CMake to re-evaluate build options.
|
||||
#
|
||||
rm -rf CMake* Trilinos* packages Dart* Testing cmake_install.cmake Makefile*
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
# Incrementally construct cmake configure options:
|
||||
|
||||
CMAKE_CONFIGURE=""
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
# Location of Trilinos source tree:
|
||||
|
||||
CMAKE_PROJECT_DIR="${HOME}/Trilinos"
|
||||
|
||||
# Location for installation:
|
||||
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_INSTALL_PREFIX=/home/projects/kokkos/mic/`date +%F`"
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
# General build options.
|
||||
# Use a variable so options can be propagated to CUDA compiler.
|
||||
|
||||
CMAKE_VERBOSE_MAKEFILE=OFF
|
||||
CMAKE_BUILD_TYPE=RELEASE
|
||||
# CMAKE_BUILD_TYPE=DEBUG
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
# Build for CUDA architecture:
|
||||
|
||||
CUDA_ARCH=""
|
||||
# CUDA_ARCH="20"
|
||||
# CUDA_ARCH="30"
|
||||
# CUDA_ARCH="35"
|
||||
|
||||
# Build for MIC architecture:
|
||||
|
||||
INTEL_XEON_PHI=ON
|
||||
|
||||
# Build with HWLOC at location:
|
||||
|
||||
HWLOC_BASE_DIR="/home/projects/libraries/mic/hwloc/1.6.2"
|
||||
|
||||
# Location for MPI to use in examples:
|
||||
|
||||
MPI_BASE_DIR=""
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
# MPI configuration only used for examples:
|
||||
#
|
||||
# Must have the MPI_BASE_DIR so that the
|
||||
# include path can be passed to the Cuda compiler
|
||||
|
||||
if [ -n "${MPI_BASE_DIR}" ] ;
|
||||
then
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_MPI:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D MPI_BASE_DIR:PATH=${MPI_BASE_DIR}"
|
||||
else
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_MPI:BOOL=OFF"
|
||||
fi
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
# Pthread configuration:
|
||||
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_Pthread:BOOL=ON"
|
||||
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_Pthread:BOOL=OFF"
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
# OpenMP configuration:
|
||||
|
||||
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_OpenMP:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_OpenMP:BOOL=OFF"
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
#-----------------------------------------------------------------------------
|
||||
# Configure packages for kokkos-only:
|
||||
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_ALL_PACKAGES:BOOL=OFF"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_EXAMPLES:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_TESTS:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosCore:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosContainers:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosExample:BOOL=ON"
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
#-----------------------------------------------------------------------------
|
||||
# Hardware locality cmake configuration:
|
||||
|
||||
if [ -n "${HWLOC_BASE_DIR}" ] ;
|
||||
then
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_HWLOC:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D HWLOC_INCLUDE_DIRS:FILEPATH=${HWLOC_BASE_DIR}/include"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D HWLOC_LIBRARY_DIRS:FILEPATH=${HWLOC_BASE_DIR}/lib"
|
||||
fi
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
# Cuda cmake configuration:
|
||||
|
||||
if [ -n "${CUDA_ARCH}" ] ;
|
||||
then
|
||||
|
||||
# Options to CUDA_NVCC_FLAGS must be semi-colon delimited,
|
||||
# this is different than the standard CMAKE_CXX_FLAGS syntax.
|
||||
|
||||
CUDA_NVCC_FLAGS="-gencode;arch=compute_${CUDA_ARCH},code=sm_${CUDA_ARCH}"
|
||||
CUDA_NVCC_FLAGS="${CUDA_NVCC_FLAGS};-Xcompiler;-Wall,-ansi"
|
||||
|
||||
if [ "${CMAKE_BUILD_TYPE}" = "DEBUG" ] ;
|
||||
then
|
||||
CUDA_NVCC_FLAGS="${CUDA_NVCC_FLAGS};-g"
|
||||
else
|
||||
CUDA_NVCC_FLAGS="${CUDA_NVCC_FLAGS};-O3"
|
||||
fi
|
||||
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_CUDA:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_CUSPARSE:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CUDA_VERBOSE_BUILD:BOOL=OFF"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CUDA_NVCC_FLAGS:STRING=${CUDA_NVCC_FLAGS}"
|
||||
|
||||
fi
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
|
||||
if [ "${INTEL}" = "ON" -o "${INTEL_XEON_PHI}" = "ON" ] ;
|
||||
then
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_C_COMPILER=icc"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_COMPILER=icpc"
|
||||
fi
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
|
||||
# Cross-compile for Intel Xeon Phi:
|
||||
|
||||
if [ "${INTEL_XEON_PHI}" = "ON" ] ;
|
||||
then
|
||||
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_SYSTEM_NAME=Linux"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_FLAGS:STRING=-mmic"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_C_FLAGS:STRING=-mmic"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_Fortran_COMPILER:FILEPATH=ifort"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D BLAS_LIBRARY_DIRS:FILEPATH=${MKLROOT}/lib/mic"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D BLAS_LIBRARY_NAMES='mkl_intel_lp64;mkl_sequential;mkl_core;pthread;m'"
|
||||
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_CHECKED_STL:BOOL=OFF"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_WARNINGS_AS_ERRORS_FLAGS:STRING=''"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D BUILD_SHARED_LIBS:BOOL=OFF"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D DART_TESTING_TIMEOUT:STRING=600"
|
||||
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D LAPACK_LIBRARY_NAMES=''"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_LAPACK_LIBRARIES=''"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_BinUtils=OFF"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_Pthread_LIBRARIES=pthread"
|
||||
|
||||
# Cannot cross-compile fortran compatibility checks on the MIC:
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_Fortran:BOOL=OFF"
|
||||
|
||||
# Tell cmake the answers to compile-and-execute tests
|
||||
# to prevent cmake from executing a cross-compiled program.
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D HAVE_GCC_ABI_DEMANGLE_EXITCODE=0"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D HAVE_TEUCHOS_BLASFLOAT_EXITCODE=0"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D LAPACK_SLAPY2_WORKS_EXITCODE=0"
|
||||
|
||||
fi
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_BUILD_TYPE:STRING=${CMAKE_BUILD_TYPE}"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_VERBOSE_MAKEFILE:BOOL=${CMAKE_VERBOSE_MAKEFILE}"
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
|
||||
echo "cmake ${CMAKE_CONFIGURE} $* ${CMAKE_PROJECT_DIR}"
|
||||
|
||||
cmake ${CMAKE_CONFIGURE} "$@" ${CMAKE_PROJECT_DIR}
|
||||
|
||||
#-----------------------------------------------------------------------------
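The header comment of this script says that additional command-line arguments are passed directly to CMake. One way to do that in POSIX `sh` without losing the grouping of values that contain semicolons or spaces is to collect the options as positional parameters instead of one long string. The sketch below assumes a hypothetical function `configure_args`; it only prints the argument list, one argument per line, so the grouping is visible:

```shell
#!/bin/sh
# Hypothetical sketch: build the CMake argument list with "set --" so each
# option stays a single word, then append whatever the caller passed in.
configure_args() {
  build_type="$1"
  shift
  set -- \
    -D "CMAKE_BUILD_TYPE:STRING=${build_type}" \
    -D "BLAS_LIBRARY_NAMES=mkl_intel_lp64;mkl_sequential;mkl_core;pthread;m" \
    "$@"
  # Print one argument per line; a real script would run: cmake "$@" <source dir>
  printf '%s\n' "$@"
}

configure_args RELEASE -D CMAKE_VERBOSE_MAKEFILE:BOOL=ON
```

With this layout no embedded single quotes are needed around the library list, because the shell never re-splits a quoted `"$@"`.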
|
||||
|
|
@ -1,293 +0,0 @@
|
|||
#!/bin/sh
|
||||
#
|
||||
# Copy this script, put it outside the Trilinos source directory, and
|
||||
# build there.
|
||||
#
|
||||
#-----------------------------------------------------------------------------
|
||||
# General build options.
|
||||
# Use a variable so options can be propagated to CUDA compiler.
|
||||
|
||||
CMAKE_BUILD_TYPE=RELEASE
|
||||
# CMAKE_BUILD_TYPE=DEBUG
|
||||
|
||||
# Source and installation directories:
|
||||
|
||||
TRILINOS_SOURCE_DIR=${HOME}/Trilinos
|
||||
TRILINOS_INSTALL_DIR=${HOME}/TrilinosInstall/`date +%F`
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
|
||||
USE_CUDA_ARCH=
|
||||
USE_THREAD=
|
||||
USE_OPENMP=
|
||||
USE_INTEL=
|
||||
USE_XEON_PHI=
|
||||
HWLOC_BASE_DIR=
|
||||
MPI_BASE_DIR=
|
||||
BLAS_LIB_DIR=
|
||||
LAPACK_LIB_DIR=
|
||||
|
||||
if [ 1 ] ; then
|
||||
# Platform 'kokkos-dev' with Cuda, OpenMP, hwloc, mpi, gnu
|
||||
USE_CUDA_ARCH="35"
|
||||
USE_OPENMP=ON
|
||||
HWLOC_BASE_DIR="/home/projects/hwloc/1.7.1/host/gnu/4.4.7"
|
||||
MPI_BASE_DIR="/home/projects/mvapich/2.0.0b/gnu/4.4.7"
|
||||
BLAS_LIB_DIR="/home/projects/blas/host/gnu/lib"
|
||||
LAPACK_LIB_DIR="/home/projects/lapack/host/gnu/lib"
|
||||
|
||||
elif [ ] ; then
|
||||
# Platform 'kokkos-dev' with Cuda, Threads, hwloc, mpi, gnu
|
||||
USE_CUDA_ARCH="35"
|
||||
USE_THREAD=ON
|
||||
HWLOC_BASE_DIR="/home/projects/hwloc/1.7.1/host/gnu/4.4.7"
|
||||
MPI_BASE_DIR="/home/projects/mvapich/2.0.0b/gnu/4.4.7"
|
||||
BLAS_LIB_DIR="/home/projects/blas/host/gnu/lib"
|
||||
LAPACK_LIB_DIR="/home/projects/lapack/host/gnu/lib"
|
||||
|
||||
elif [ ] ; then
|
||||
# Platform 'kokkos-dev' with Xeon Phi and hwloc
|
||||
USE_OPENMP=ON
|
||||
USE_INTEL=ON
|
||||
USE_XEON_PHI=ON
|
||||
HWLOC_BASE_DIR="/home/projects/hwloc/1.7.1/mic/intel/13.SP1.1.106"
|
||||
|
||||
elif [ ] ; then
|
||||
# Platform 'kokkos-nvidia' with Cuda, OpenMP, hwloc, mpi, gnu
|
||||
USE_CUDA_ARCH="20"
|
||||
USE_OPENMP=ON
|
||||
HWLOC_BASE_DIR="/home/sems/common/hwloc/current"
|
||||
MPI_BASE_DIR="/home/sems/common/openmpi/current"
|
||||
|
||||
elif [ ] ; then
|
||||
# Platform 'kokkos-nvidia' with Cuda, Threads, hwloc, mpi, gnu
|
||||
USE_CUDA_ARCH="20"
|
||||
USE_THREAD=ON
|
||||
HWLOC_BASE_DIR="/home/sems/common/hwloc/current"
|
||||
MPI_BASE_DIR="/home/sems/common/openmpi/current"
|
||||
|
||||
fi
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
# Incrementally construct cmake configure command line options:
|
||||
|
||||
CMAKE_CONFIGURE=""
|
||||
CMAKE_CXX_FLAGS=""
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
# Configure for Kokkos subpackages and tests:
|
||||
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_ALL_PACKAGES:BOOL=OFF"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_EXAMPLES:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_TESTS:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosCore:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosContainers:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_TpetraKernels:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosExample:BOOL=ON"
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
|
||||
if [ 1 ] ; then
|
||||
|
||||
# Configure for Tpetra/Kokkos:
|
||||
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D BLAS_LIBRARY_DIRS:FILEPATH=${BLAS_LIB_DIR}"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D LAPACK_LIBRARY_DIRS:FILEPATH=${LAPACK_LIB_DIR}"
|
||||
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_Tpetra:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_Kokkos:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_TpetraClassic:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_TeuchosKokkosCompat:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_TeuchosKokkosComm:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Tpetra_ENABLE_Kokkos_Refactor:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D KokkosClassic_DefaultNode:STRING=Kokkos::Compat::KokkosOpenMPWrapperNode"
|
||||
|
||||
CMAKE_CXX_FLAGS="${CMAKE_CXX_FLAGS}-DKOKKOS_FAST_COMPILE"
|
||||
|
||||
if [ -n "${USE_CUDA_ARCH}" ] ; then
|
||||
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_Cuda:BOOL=ON"
|
||||
|
||||
fi
|
||||
|
||||
fi
|
||||
|
||||
if [ 1 ] ; then
|
||||
|
||||
# Configure for Stokhos:
|
||||
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_Sacado:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_Stokhos:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Stokhos_ENABLE_Belos:BOOL=ON"
|
||||
|
||||
fi
|
||||
|
||||
if [ 1 ] ; then
|
||||
|
||||
# Configure for TrilinosCouplings:
|
||||
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_TrilinosCouplings:BOOL=ON"
|
||||
|
||||
fi
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
#-----------------------------------------------------------------------------
|
||||
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_BUILD_TYPE:STRING=${CMAKE_BUILD_TYPE}"
|
||||
|
||||
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_VERBOSE_MAKEFILE:BOOL=ON"
|
||||
|
||||
if [ "${CMAKE_BUILD_TYPE}" = "DEBUG" ] ;
|
||||
then
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_BOUNDS_CHECK:BOOL=ON"
|
||||
fi
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
# Location for installation:
|
||||
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_INSTALL_PREFIX=${TRILINOS_INSTALL_DIR}"
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
# MPI configuration only used for examples:
|
||||
#
|
||||
# Must have the MPI_BASE_DIR so that the
|
||||
# include path can be passed to the Cuda compiler
|
||||
|
||||
if [ -n "${MPI_BASE_DIR}" ] ;
|
||||
then
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_MPI:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D MPI_BASE_DIR:PATH=${MPI_BASE_DIR}"
|
||||
else
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_MPI:BOOL=OFF"
|
||||
fi
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
# Kokkos Pthread configuration:
|
||||
|
||||
if [ "${USE_THREAD}" = "ON" ] ;
|
||||
then
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_Pthread:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_Pthread:BOOL=ON"
|
||||
else
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_Pthread:BOOL=OFF"
|
||||
fi
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
# Kokkos OpenMP configuration:
|
||||
|
||||
if [ "${USE_OPENMP}" = "ON" ] ;
|
||||
then
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_OpenMP:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_OpenMP:BOOL=ON"
|
||||
else
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_OpenMP:BOOL=OFF"
|
||||
fi
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
# Hardware locality configuration:
|
||||
|
||||
if [ -n "${HWLOC_BASE_DIR}" ] ;
|
||||
then
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_HWLOC:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D HWLOC_INCLUDE_DIRS:FILEPATH=${HWLOC_BASE_DIR}/include"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D HWLOC_LIBRARY_DIRS:FILEPATH=${HWLOC_BASE_DIR}/lib"
|
||||
fi
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
# Cuda cmake configuration:
|
||||
|
||||
if [ -n "${USE_CUDA_ARCH}" ] ;
|
||||
then
|
||||
|
||||
# Options to CUDA_NVCC_FLAGS must be semi-colon delimited,
|
||||
# this is different than the standard CMAKE_CXX_FLAGS syntax.
|
||||
|
||||
CUDA_NVCC_FLAGS="-DKOKKOS_HAVE_CUDA_ARCH=${USE_CUDA_ARCH}0;-gencode;arch=compute_${USE_CUDA_ARCH},code=sm_${USE_CUDA_ARCH}"
|
||||
|
||||
if [ "${USE_OPENMP}" = "ON" ] ;
|
||||
then
|
||||
CUDA_NVCC_FLAGS="${CUDA_NVCC_FLAGS};-Xcompiler;-Wall,-ansi,-fopenmp"
|
||||
else
|
||||
CUDA_NVCC_FLAGS="${CUDA_NVCC_FLAGS};-Xcompiler;-Wall,-ansi"
|
||||
fi
|
||||
|
||||
if [ "${CMAKE_BUILD_TYPE}" = "DEBUG" ] ;
|
||||
then
|
||||
CUDA_NVCC_FLAGS="${CUDA_NVCC_FLAGS};-g"
|
||||
else
|
||||
CUDA_NVCC_FLAGS="${CUDA_NVCC_FLAGS};-O3"
|
||||
fi
|
||||
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_CUDA:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_CUSPARSE:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CUDA_VERBOSE_BUILD:BOOL=OFF"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CUDA_NVCC_FLAGS:STRING=${CUDA_NVCC_FLAGS}"
|
||||
|
||||
fi
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
|
||||
if [ "${USE_INTEL}" = "ON" -o "${USE_XEON_PHI}" = "ON" ] ;
|
||||
then
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_C_COMPILER=icc"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_COMPILER=icpc"
|
||||
fi
|
||||
|
||||
# Cross-compile for Intel Xeon Phi:
|
||||
|
||||
if [ "${USE_XEON_PHI}" = "ON" ] ;
|
||||
then
|
||||
|
||||
CMAKE_CXX_FLAGS="${CMAKE_CXX_FLAGS} -mmic"
|
||||
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_SYSTEM_NAME=Linux"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_C_FLAGS:STRING=-mmic"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_Fortran_COMPILER:FILEPATH=ifort"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D BLAS_LIBRARY_DIRS:FILEPATH=${MKLROOT}/lib/mic"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D BLAS_LIBRARY_NAMES='mkl_intel_lp64;mkl_sequential;mkl_core;pthread;m'"
|
||||
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_CHECKED_STL:BOOL=OFF"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_WARNINGS_AS_ERRORS_FLAGS:STRING=''"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D BUILD_SHARED_LIBS:BOOL=OFF"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D DART_TESTING_TIMEOUT:STRING=600"
|
||||
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D LAPACK_LIBRARY_NAMES=''"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_LAPACK_LIBRARIES=''"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_BinUtils=OFF"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_Pthread_LIBRARIES=pthread"
|
||||
|
||||
# Cannot cross-compile fortran compatibility checks on the MIC:
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_Fortran:BOOL=OFF"
|
||||
|
||||
# Tell cmake the answers to compile-and-execute tests
|
||||
# to prevent cmake from executing a cross-compiled program.
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D HAVE_GCC_ABI_DEMANGLE_EXITCODE=0"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D HAVE_TEUCHOS_BLASFLOAT_EXITCODE=0"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D LAPACK_SLAPY2_WORKS_EXITCODE=0"
|
||||
|
||||
fi
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
#-----------------------------------------------------------------------------
|
||||
|
||||
if [ -n "${CMAKE_CXX_FLAGS}" ] ; then
|
||||
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_FLAGS:STRING='${CMAKE_CXX_FLAGS}'"
|
||||
|
||||
fi
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
#
|
||||
# Remove CMake output files to force reconfigure from scratch.
|
||||
#
|
||||
|
||||
rm -rf CMake* Trilinos* packages Dart* Testing cmake_install.cmake Makefile*
|
||||
|
||||
#
|
||||
|
||||
echo "cmake ${CMAKE_CONFIGURE} ${TRILINOS_SOURCE_DIR}"
|
||||
|
||||
cmake ${CMAKE_CONFIGURE} ${TRILINOS_SOURCE_DIR}
|
||||
|
||||
#-----------------------------------------------------------------------------
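The script above selects a platform by hand-editing an `if [ 1 ] ... elif [ ] ...` chain: `[ 1 ]` is true because the string is non-empty, and `[ ]` is false. A sketch of the same selection driven by a name, assuming hypothetical platform labels and a hypothetical function `select_platform` (neither appears in the original); the site-specific library paths are left out on purpose:

```shell
#!/bin/sh
# Hypothetical sketch: choose one platform block by name instead of editing
# "[ 1 ]" / "[ ]" tests. Variable names match the script; labels are made up.
select_platform() {
  USE_CUDA_ARCH=""
  USE_THREAD=""
  USE_OPENMP=""
  USE_INTEL=""
  USE_XEON_PHI=""
  case "$1" in
    kokkos-dev-cuda-openmp)     USE_CUDA_ARCH="35"; USE_OPENMP=ON ;;
    kokkos-dev-cuda-threads)    USE_CUDA_ARCH="35"; USE_THREAD=ON ;;
    kokkos-dev-xeon-phi)        USE_OPENMP=ON; USE_INTEL=ON; USE_XEON_PHI=ON ;;
    kokkos-nvidia-cuda-openmp)  USE_CUDA_ARCH="20"; USE_OPENMP=ON ;;
    kokkos-nvidia-cuda-threads) USE_CUDA_ARCH="20"; USE_THREAD=ON ;;
    *) echo "unknown platform: $1" >&2; return 1 ;;
  esac
}

select_platform kokkos-dev-cuda-openmp &&
  echo "CUDA arch: ${USE_CUDA_ARCH}, OpenMP: ${USE_OPENMP}"
```

Resetting every variable at the top of the function keeps a second call from inheriting settings left over from an earlier selection.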
|
||||
|
|
@ -1,88 +0,0 @@
|
|||
#!/bin/sh
|
||||
#
|
||||
# Copy this script, put it outside the Trilinos source directory, and
|
||||
# build there.
|
||||
#
|
||||
# Additional command-line arguments given to this script will be
|
||||
# passed directly to CMake.
|
||||
#
|
||||
|
||||
# to build:
|
||||
# build on bgq-b[1-12]
|
||||
# module load sierra-devel
|
||||
# run this configure file
|
||||
# make
|
||||
|
||||
# to run:
|
||||
# ssh bgq-login
|
||||
# cd /scratch/username/...
|
||||
# export OMP_PROC_BIND and XLSMPOPTS environment variables
|
||||
# run with srun
|
||||
|
||||
# Note: hwloc cannot get or set CPU bindings on BG/Q.
|
||||
# Use the OpenMP backend and the OpenMP environment variables.
|
||||
#
|
||||
# Only the MPI wrappers seem to be set up for cross-compilation,
|
||||
# so it is important that this configuration enables MPI and uses the mpigcc wrappers.
|
||||
|
||||
|
||||
|
||||
#
|
||||
# Force CMake to re-evaluate build options.
|
||||
#
|
||||
rm -rf CMake* Trilinos* packages Dart* Testing cmake_install.cmake Makefile*
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
# Incrementally construct cmake configure options:
|
||||
|
||||
CMAKE_CONFIGURE=""
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
# Location of Trilinos source tree:
|
||||
|
||||
CMAKE_PROJECT_DIR="../Trilinos"
|
||||
|
||||
# Location for installation:
|
||||
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_INSTALL_PREFIX=../TrilinosInstall/`date +%F`"
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
# General build options.
|
||||
# Use a variable so options can be propagated to CUDA compiler.
|
||||
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_C_COMPILER=mpigcc-4.7.2"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_COMPILER=mpig++-4.7.2"
|
||||
|
||||
CMAKE_VERBOSE_MAKEFILE=OFF
|
||||
CMAKE_BUILD_TYPE=RELEASE
|
||||
# CMAKE_BUILD_TYPE=DEBUG
|
||||
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_MPI:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_OpenMP:BOOL=ON"
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
#-----------------------------------------------------------------------------
|
||||
# Configure packages for kokkos-only:
|
||||
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_ALL_PACKAGES:BOOL=OFF"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_EXAMPLES:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_TESTS:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosCore:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosContainers:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_TpetraKernels:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosExample:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_Pthread:BOOL=OFF"
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_BUILD_TYPE:STRING=${CMAKE_BUILD_TYPE}"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_VERBOSE_MAKEFILE:BOOL=${CMAKE_VERBOSE_MAKEFILE}"
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
|
||||
echo "cmake ${CMAKE_CONFIGURE} $* ${CMAKE_PROJECT_DIR}"
|
||||
|
||||
cmake ${CMAKE_CONFIGURE} "$@" ${CMAKE_PROJECT_DIR}
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
|
|
@ -1,216 +0,0 @@
|
|||
#!/bin/sh
|
||||
#
|
||||
# Copy this script, put it outside the Trilinos source directory, and
|
||||
# build there.
|
||||
#
|
||||
# Additional command-line arguments given to this script will be
|
||||
# passed directly to CMake.
|
||||
#
|
||||
|
||||
#
|
||||
# Force CMake to re-evaluate build options.
|
||||
#
|
||||
rm -rf CMake* Trilinos* packages Dart* Testing cmake_install.cmake Makefile*
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
# Incrementally construct cmake configure options:
|
||||
|
||||
CMAKE_CONFIGURE=""
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
# Location of Trilinos source tree:
|
||||
|
||||
CMAKE_PROJECT_DIR="${HOME}/Trilinos"
|
||||
|
||||
# Location for installation:
|
||||
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_INSTALL_PREFIX=${HOME}/TrilinosInstall/`date +%F`"
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
# General build options.
|
||||
# Use a variable so options can be propagated to CUDA compiler.
|
||||
|
||||
CMAKE_VERBOSE_MAKEFILE=OFF
|
||||
CMAKE_BUILD_TYPE=RELEASE
|
||||
#CMAKE_BUILD_TYPE=DEBUG
|
||||
#CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_BOUNDS_CHECK:BOOL=ON"
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
# Build for CUDA architecture:
|
||||
|
||||
#CUDA_ARCH=""
|
||||
#CUDA_ARCH="20"
|
||||
#CUDA_ARCH="30"
|
||||
CUDA_ARCH="35"
|
||||
|
||||
# Build with OpenMP and Pthreads:
|
||||
|
||||
OPENMP=ON
|
||||
PTHREADS=ON
|
||||
|
||||
# Build host code with Intel compiler:
|
||||
|
||||
INTEL=OFF
|
||||
|
||||
# Build for MIC architecture:
|
||||
|
||||
INTEL_XEON_PHI=OFF
|
||||
|
||||
# Build with HWLOC at location:
|
||||
|
||||
#HWLOC_BASE_DIR=""
|
||||
#HWLOC_BASE_DIR="/home/projects/hwloc/1.7.1/host/gnu/4.4.7"
|
||||
HWLOC_BASE_DIR="/home/projects/hwloc/1.7.1/host/gnu/4.7.3"
|
||||
|
||||
# Location for MPI to use in examples:
|
||||
|
||||
#MPI_BASE_DIR=""
|
||||
#MPI_BASE_DIR="/home/projects/mvapich/2.0.0b/gnu/4.4.7"
|
||||
MPI_BASE_DIR="/home/projects/mvapich/2.0.0b/gnu/4.7.3"
|
||||
#MPI_BASE_DIR="/home/projects/openmpi/1.7.3/llvm/2013-12-02/"
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
# MPI configuration only used for examples:
|
||||
#
|
||||
# Must have the MPI_BASE_DIR so that the
|
||||
# include path can be passed to the Cuda compiler
|
||||
|
||||
if [ -n "${MPI_BASE_DIR}" ] ;
|
||||
then
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_MPI:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D MPI_BASE_DIR:PATH=${MPI_BASE_DIR}"
|
||||
else
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_MPI:BOOL=OFF"
|
||||
fi
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
# Pthread configuration:
|
||||
|
||||
if [ "${PTHREADS}" = "ON" ] ;
|
||||
then
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_Pthread:BOOL=ON"
|
||||
else
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_Pthread:BOOL=OFF"
|
||||
fi
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
# OpenMP configuration:
|
||||
|
||||
if [ "${OPENMP}" = "ON" ] ;
|
||||
then
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_OpenMP:BOOL=ON"
|
||||
else
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_OpenMP:BOOL=OFF"
|
||||
fi
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
#-----------------------------------------------------------------------------
|
||||
# Configure packages for kokkos-only:
|
||||
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_ALL_PACKAGES:BOOL=OFF"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_EXAMPLES:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_TESTS:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosCore:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosContainers:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_TpetraKernels:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosExample:BOOL=ON"
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
#-----------------------------------------------------------------------------
|
||||
# Hardware locality cmake configuration:
|
||||
|
||||
if [ -n "${HWLOC_BASE_DIR}" ] ;
|
||||
then
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_HWLOC:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D HWLOC_INCLUDE_DIRS:FILEPATH=${HWLOC_BASE_DIR}/include"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D HWLOC_LIBRARY_DIRS:FILEPATH=${HWLOC_BASE_DIR}/lib"
|
||||
fi
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
# Cuda cmake configuration:
|
||||
|
||||
if [ -n "${CUDA_ARCH}" ] ;
|
||||
then
|
||||
|
||||
# Options to CUDA_NVCC_FLAGS must be semi-colon delimited,
|
||||
# this is different than the standard CMAKE_CXX_FLAGS syntax.
|
||||
|
||||
CUDA_NVCC_FLAGS="-gencode;arch=compute_${CUDA_ARCH},code=sm_${CUDA_ARCH}"
|
||||
|
||||
if [ "${OPENMP}" = "ON" ] ;
|
||||
then
|
||||
CUDA_NVCC_FLAGS="${CUDA_NVCC_FLAGS};-Xcompiler;-Wall,-ansi,-fopenmp"
|
||||
else
|
||||
CUDA_NVCC_FLAGS="${CUDA_NVCC_FLAGS};-Xcompiler;-Wall,-ansi"
|
||||
fi
|
||||
|
||||
if [ "${CMAKE_BUILD_TYPE}" = "DEBUG" ] ;
|
||||
then
|
||||
CUDA_NVCC_FLAGS="${CUDA_NVCC_FLAGS};-g"
|
||||
else
|
||||
CUDA_NVCC_FLAGS="${CUDA_NVCC_FLAGS};-O3"
|
||||
fi
|
||||
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_CUDA:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_CUSPARSE:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CUDA_VERBOSE_BUILD:BOOL=OFF"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CUDA_NVCC_FLAGS:STRING=${CUDA_NVCC_FLAGS}"
|
||||
|
||||
fi
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
|
||||
if [ "${INTEL}" = "ON" -o "${INTEL_XEON_PHI}" = "ON" ] ;
|
||||
then
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_C_COMPILER=icc"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_COMPILER=icpc"
|
||||
fi
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
|
||||
# Cross-compile for Intel Xeon Phi:
|
||||
|
||||
if [ "${INTEL_XEON_PHI}" = "ON" ] ;
|
||||
then
|
||||
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_SYSTEM_NAME=Linux"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_FLAGS:STRING=-mmic"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_C_FLAGS:STRING=-mmic"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_Fortran_COMPILER:FILEPATH=ifort"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D BLAS_LIBRARY_DIRS:FILEPATH=${MKLROOT}/lib/mic"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D BLAS_LIBRARY_NAMES='mkl_intel_lp64;mkl_sequential;mkl_core;pthread;m'"
|
||||
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_CHECKED_STL:BOOL=OFF"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_WARNINGS_AS_ERRORS_FLAGS:STRING=''"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D BUILD_SHARED_LIBS:BOOL=OFF"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D DART_TESTING_TIMEOUT:STRING=600"
|
||||
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D LAPACK_LIBRARY_NAMES=''"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_LAPACK_LIBRARIES=''"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_BinUtils=OFF"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_Pthread_LIBRARIES=pthread"
|
||||
|
||||
# Cannot cross-compile fortran compatibility checks on the MIC:
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_Fortran:BOOL=OFF"
|
||||
|
||||
# Tell cmake the answers to compile-and-execute tests
|
||||
# to prevent cmake from executing a cross-compiled program.
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D HAVE_GCC_ABI_DEMANGLE_EXITCODE=0"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D HAVE_TEUCHOS_BLASFLOAT_EXITCODE=0"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D LAPACK_SLAPY2_WORKS_EXITCODE=0"
|
||||
|
||||
fi
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_BUILD_TYPE:STRING=${CMAKE_BUILD_TYPE}"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_VERBOSE_MAKEFILE:BOOL=${CMAKE_VERBOSE_MAKEFILE}"
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
|
||||
echo "cmake ${CMAKE_CONFIGURE} ${CMAKE_PROJECT_DIR}"
|
||||
|
||||
cmake ${CMAKE_CONFIGURE} ${CMAKE_PROJECT_DIR}
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
|
|
@ -1,204 +0,0 @@
|
|||
#!/bin/sh
|
||||
#
|
||||
# Copy this script, put it outside the Trilinos source directory, and
|
||||
# build there.
|
||||
#
|
||||
# Additional command-line arguments given to this script will be
|
||||
# passed directly to CMake.
|
||||
#
|
||||
|
||||
#
|
||||
# Force CMake to re-evaluate build options.
|
||||
#
|
||||
rm -rf CMake* Trilinos* packages Dart* Testing cmake_install.cmake MakeFile*
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
# Incrementally construct cmake configure options:
|
||||
|
||||
CMAKE_CONFIGURE=""
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
# Location of Trilinos source tree:
|
||||
|
||||
CMAKE_PROJECT_DIR="${HOME}/Trilinos"
|
||||
|
||||
# Location for installation:
|
||||
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_INSTALL_PREFIX=/home/sems/common/kokkos/`date +%F`"
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
# General build options.
|
||||
# Use a variable so options can be propagated to CUDA compiler.
|
||||
|
||||
CMAKE_VERBOSE_MAKEFILE=OFF
|
||||
CMAKE_BUILD_TYPE=RELEASE
|
||||
# CMAKE_BUILD_TYPE=DEBUG
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
# Build for CUDA architecture:
|
||||
|
||||
# CUDA_ARCH=""
|
||||
CUDA_ARCH="20"
|
||||
# CUDA_ARCH="30"
|
||||
# CUDA_ARCH="35"
|
||||
|
||||
# Build with OpenMP
|
||||
|
||||
OPENMP=ON
|
||||
|
||||
# Build host code with Intel compiler:
|
||||
|
||||
# INTEL=ON
|
||||
|
||||
# Build for MIC architecture:
|
||||
|
||||
# INTEL_XEON_PHI=ON
|
||||
|
||||
# Build with HWLOC at location:
|
||||
|
||||
HWLOC_BASE_DIR="/home/sems/common/hwloc/current"
|
||||
|
||||
# Location for MPI to use in examples:
|
||||
|
||||
MPI_BASE_DIR="/home/sems/common/openmpi/current"
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
# MPI configuration only used for examples:
|
||||
#
|
||||
# Must have the MPI_BASE_DIR so that the
|
||||
# include path can be passed to the Cuda compiler
|
||||
|
||||
if [ -n "${MPI_BASE_DIR}" ] ;
|
||||
then
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_MPI:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D MPI_BASE_DIR:PATH=${MPI_BASE_DIR}"
|
||||
else
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_MPI:BOOL=OFF"
|
||||
fi
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
# Pthread configuration:
|
||||
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_Pthread:BOOL=ON"
|
||||
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_Pthread:BOOL=OFF"
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
# OpenMP configuration:
|
||||
|
||||
if [ "${OPENMP}" = "ON" ] ;
|
||||
then
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_OpenMP:BOOL=ON"
|
||||
else
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_OpenMP:BOOL=OFF"
|
||||
fi
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
#-----------------------------------------------------------------------------
|
||||
# Configure packages for kokkos-only:
|
||||
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_ALL_PACKAGES:BOOL=OFF"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_EXAMPLES:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_TESTS:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosCore:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosContainers:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosExample:BOOL=ON"
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
#-----------------------------------------------------------------------------
|
||||
# Hardware locality cmake configuration:
|
||||
|
||||
if [ -n "${HWLOC_BASE_DIR}" ] ;
|
||||
then
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_HWLOC:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D HWLOC_INCLUDE_DIRS:FILEPATH=${HWLOC_BASE_DIR}/include"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D HWLOC_LIBRARY_DIRS:FILEPATH=${HWLOC_BASE_DIR}/lib"
|
||||
fi
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
# Cuda cmake configuration:
|
||||
|
||||
if [ -n "${CUDA_ARCH}" ] ;
|
||||
then
|
||||
|
||||
# Options to CUDA_NVCC_FLAGS must be semi-colon delimited,
|
||||
# this is different than the standard CMAKE_CXX_FLAGS syntax.
|
||||
|
||||
CUDA_NVCC_FLAGS="-gencode;arch=compute_${CUDA_ARCH},code=sm_${CUDA_ARCH}"
|
||||
|
||||
if [ "${OPENMP}" = "ON" ] ;
|
||||
then
|
||||
CUDA_NVCC_FLAGS="${CUDA_NVCC_FLAGS};-Xcompiler;-Wall,-ansi,-fopenmp"
|
||||
else
|
||||
CUDA_NVCC_FLAGS="${CUDA_NVCC_FLAGS};-Xcompiler;-Wall,-ansi"
|
||||
fi
|
||||
|
||||
if [ "${CMAKE_BUILD_TYPE}" = "DEBUG" ] ;
|
||||
then
|
||||
CUDA_NVCC_FLAGS="${CUDA_NVCC_FLAGS};-g"
|
||||
else
|
||||
CUDA_NVCC_FLAGS="${CUDA_NVCC_FLAGS};-O3"
|
||||
fi
|
||||
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_CUDA:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_CUSPARSE:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CUDA_VERBOSE_BUILD:BOOL=OFF"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CUDA_NVCC_FLAGS:STRING=${CUDA_NVCC_FLAGS}"
|
||||
|
||||
fi
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
|
||||
if [ "${INTEL}" = "ON" -o "${INTEL_XEON_PHI}" = "ON" ] ;
|
||||
then
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_C_COMPILER=icc"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_COMPILER=icpc"
|
||||
fi
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
|
||||
# Cross-compile for Intel Xeon Phi:
|
||||
|
||||
if [ "${INTEL_XEON_PHI}" = "ON" ] ;
|
||||
then
|
||||
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_SYSTEM_NAME=Linux"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_FLAGS:STRING=-mmic"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_C_FLAGS:STRING=-mmic"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_Fortran_COMPILER:FILEPATH=ifort"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D BLAS_LIBRARY_DIRS:FILEPATH=${MKLROOT}/lib/mic"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D BLAS_LIBRARY_NAMES='mkl_intel_lp64;mkl_sequential;mkl_core;pthread;m'"
|
||||
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_CHECKED_STL:BOOL=OFF"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_WARNINGS_AS_ERRORS_FLAGS:STRING=''"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D BUILD_SHARED_LIBS:BOOL=OFF"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D DART_TESTING_TIMEOUT:STRING=600"
|
||||
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D LAPACK_LIBRARY_NAMES=''"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_LAPACK_LIBRARIES=''"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_BinUtils=OFF"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_Pthread_LIBRARIES=pthread"
|
||||
|
||||
# Cannot cross-compile fortran compatibility checks on the MIC:
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_Fortran:BOOL=OFF"
|
||||
|
||||
# Tell cmake the answers to compile-and-execute tests
|
||||
# to prevent cmake from executing a cross-compiled program.
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D HAVE_GCC_ABI_DEMANGLE_EXITCODE=0"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D HAVE_TEUCHOS_BLASFLOAT_EXITCODE=0"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D LAPACK_SLAPY2_WORKS_EXITCODE=0"
|
||||
|
||||
fi
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_BUILD_TYPE:STRING=${CMAKE_BUILD_TYPE}"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_VERBOSE_MAKEFILE:BOOL=${CMAKE_VERBOSE_MAKEFILE}"
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
|
||||
echo "cmake ${CMAKE_CONFIGURE} ${CMAKE_PROJECT_DIR}"
|
||||
|
||||
cmake ${CMAKE_CONFIGURE} ${CMAKE_PROJECT_DIR}
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
|
|
@ -1,190 +0,0 @@
|
|||
#!/bin/sh
|
||||
#
|
||||
# Copy this script, put it outside the Trilinos source directory, and
|
||||
# build there.
|
||||
#
|
||||
# Additional command-line arguments given to this script will be
|
||||
# passed directly to CMake.
|
||||
#
|
||||
|
||||
#
|
||||
# Force CMake to re-evaluate build options.
|
||||
#
|
||||
rm -rf CMake* Trilinos* packages Dart* Testing cmake_install.cmake MakeFile*
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
# Incrementally construct cmake configure options:
|
||||
|
||||
CMAKE_CONFIGURE=""
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
# Location of Trilinos source tree:
|
||||
|
||||
CMAKE_PROJECT_DIR="${HOME}/Trilinos"
|
||||
|
||||
# Location for installation:
|
||||
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_INSTALL_PREFIX=/home/projects/kokkos/`date +%F`"
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
# General build options.
|
||||
# Use a variable so options can be propagated to CUDA compiler.
|
||||
|
||||
CMAKE_VERBOSE_MAKEFILE=OFF
|
||||
CMAKE_BUILD_TYPE=RELEASE
|
||||
# CMAKE_BUILD_TYPE=DEBUG
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
# Build for CUDA architecture:
|
||||
|
||||
# CUDA_ARCH=""
|
||||
# CUDA_ARCH="20"
|
||||
# CUDA_ARCH="30"
|
||||
CUDA_ARCH="35"
|
||||
|
||||
# Build host code with Intel compiler:
|
||||
|
||||
INTEL=ON
|
||||
|
||||
# Build for MIC architecture:
|
||||
|
||||
# INTEL_XEON_PHI=ON
|
||||
|
||||
# Build with HWLOC at location:
|
||||
|
||||
HWLOC_BASE_DIR="/home/projects/hwloc/1.6.2"
|
||||
|
||||
# Location for MPI to use in examples:
|
||||
|
||||
MPI_BASE_DIR=""
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
# MPI configuration only used for examples:
|
||||
#
|
||||
# Must have the MPI_BASE_DIR so that the
|
||||
# include path can be passed to the Cuda compiler
|
||||
|
||||
if [ -n "${MPI_BASE_DIR}" ] ;
|
||||
then
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_MPI:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D MPI_BASE_DIR:PATH=${MPI_BASE_DIR}"
|
||||
else
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_MPI:BOOL=OFF"
|
||||
fi
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
# Pthread configuration:
|
||||
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_Pthread:BOOL=ON"
|
||||
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_Pthread:BOOL=OFF"
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
# OpenMP configuration:
|
||||
|
||||
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_OpenMP:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_OpenMP:BOOL=OFF"
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
#-----------------------------------------------------------------------------
|
||||
# Configure packages for kokkos-only:
|
||||
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_ALL_PACKAGES:BOOL=OFF"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_EXAMPLES:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_TESTS:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosCore:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosContainers:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosExample:BOOL=ON"
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
#-----------------------------------------------------------------------------
|
||||
# Hardware locality cmake configuration:
|
||||
|
||||
if [ -n "${HWLOC_BASE_DIR}" ] ;
|
||||
then
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_HWLOC:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D HWLOC_INCLUDE_DIRS:FILEPATH=${HWLOC_BASE_DIR}/include"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D HWLOC_LIBRARY_DIRS:FILEPATH=${HWLOC_BASE_DIR}/lib"
|
||||
fi
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
# Cuda cmake configuration:
|
||||
|
||||
if [ -n "${CUDA_ARCH}" ] ;
|
||||
then
|
||||
|
||||
# Options to CUDA_NVCC_FLAGS must be semi-colon delimited,
|
||||
# this is different than the standard CMAKE_CXX_FLAGS syntax.
|
||||
|
||||
CUDA_NVCC_FLAGS="-gencode;arch=compute_${CUDA_ARCH},code=sm_${CUDA_ARCH}"
|
||||
CUDA_NVCC_FLAGS="${CUDA_NVCC_FLAGS};-Xcompiler;-Wall,-ansi"
|
||||
|
||||
if [ "${CMAKE_BUILD_TYPE}" = "DEBUG" ] ;
|
||||
then
|
||||
CUDA_NVCC_FLAGS="${CUDA_NVCC_FLAGS};-g"
|
||||
else
|
||||
CUDA_NVCC_FLAGS="${CUDA_NVCC_FLAGS};-O3"
|
||||
fi
|
||||
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_CUDA:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_CUSPARSE:BOOL=ON"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CUDA_VERBOSE_BUILD:BOOL=OFF"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CUDA_NVCC_FLAGS:STRING=${CUDA_NVCC_FLAGS}"
|
||||
|
||||
fi
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
|
||||
if [ "${INTEL}" = "ON" -o "${INTEL_XEON_PHI}" = "ON" ] ;
|
||||
then
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_C_COMPILER=icc"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_COMPILER=icpc"
|
||||
fi
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
|
||||
# Cross-compile for Intel Xeon Phi:
|
||||
|
||||
if [ "${INTEL_XEON_PHI}" = "ON" ] ;
|
||||
then
|
||||
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_SYSTEM_NAME=Linux"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_FLAGS:STRING=-mmic"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_C_FLAGS:STRING=-mmic"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_Fortran_COMPILER:FILEPATH=ifort"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D BLAS_LIBRARY_DIRS:FILEPATH=${MKLROOT}/lib/mic"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D BLAS_LIBRARY_NAMES='mkl_intel_lp64;mkl_sequential;mkl_core;pthread;m'"
|
||||
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_CHECKED_STL:BOOL=OFF"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_WARNINGS_AS_ERRORS_FLAGS:STRING=''"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D BUILD_SHARED_LIBS:BOOL=OFF"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D DART_TESTING_TIMEOUT:STRING=600"
|
||||
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D LAPACK_LIBRARY_NAMES=''"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_LAPACK_LIBRARIES=''"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_BinUtils=OFF"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_Pthread_LIBRARIES=pthread"
|
||||
|
||||
# Cannot cross-compile fortran compatibility checks on the MIC:
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_Fortran:BOOL=OFF"
|
||||
|
||||
# Tell cmake the answers to compile-and-execute tests
|
||||
# to prevent cmake from executing a cross-compiled program.
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D HAVE_GCC_ABI_DEMANGLE_EXITCODE=0"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D HAVE_TEUCHOS_BLASFLOAT_EXITCODE=0"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D LAPACK_SLAPY2_WORKS_EXITCODE=0"
|
||||
|
||||
fi
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_BUILD_TYPE:STRING=${CMAKE_BUILD_TYPE}"
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_VERBOSE_MAKEFILE:BOOL=${CMAKE_VERBOSE_MAKEFILE}"
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
|
||||
echo "cmake ${CMAKE_CONFIGURE} ${CMAKE_PROJECT_DIR}"
|
||||
|
||||
cmake ${CMAKE_CONFIGURE} ${CMAKE_PROJECT_DIR}
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
|
|
@ -1,140 +0,0 @@
|
|||
#!/bin/bash
|
||||
#
|
||||
# This script uses CUDA, OpenMP, and MPI.
|
||||
#
|
||||
# Before invoking this script, set the OMPI_CXX environment variable
|
||||
# to point to nvcc_wrapper, wherever it happens to live. (If you use
|
||||
# an MPI implementation other than OpenMPI, set the corresponding
|
||||
# environment variable instead.)
|
||||
#
|
||||
|
||||
rm -f CMakeCache.txt;
|
||||
rm -rf CMakeFiles
|
||||
EXTRA_ARGS=$@
|
||||
MPI_PATH="/opt/mpi/openmpi/1.8.2/nvcc-gcc/4.8.3-6.5"
|
||||
CUDA_PATH="/opt/nvidia/cuda/6.5.14"
|
||||
|
||||
#
|
||||
# As long as there are any .cu files in Trilinos, we'll need to set
|
||||
# CUDA_NVCC_FLAGS. If Trilinos gets rid of all of its .cu files and
|
||||
# lets nvcc_wrapper handle them as .cpp files, then we won't need to
|
||||
# set CUDA_NVCC_FLAGS. As it is, given that we need to set
|
||||
# CUDA_NVCC_FLAGS, we must make sure that they are the same flags as
|
||||
# nvcc_wrapper passes to nvcc.
|
||||
#
|
||||
CUDA_NVCC_FLAGS="-gencode;arch=compute_35,code=sm_35;-I${MPI_PATH}/include"
|
||||
CUDA_NVCC_FLAGS="${CUDA_NVCC_FLAGS};-Xcompiler;-Wall,-ansi,-fopenmp"
|
||||
CUDA_NVCC_FLAGS="${CUDA_NVCC_FLAGS};-O3;-DKOKKOS_USE_CUDA_UVM"
|
||||
|
||||
cmake \
|
||||
-D CMAKE_INSTALL_PREFIX:PATH="$PWD/../install/" \
|
||||
-D CMAKE_BUILD_TYPE:STRING=DEBUG \
|
||||
-D CMAKE_CXX_FLAGS:STRING="-g -Wall" \
|
||||
-D CMAKE_C_FLAGS:STRING="-g -Wall" \
|
||||
-D CMAKE_Fortran_FLAGS:STRING="" \
|
||||
-D CMAKE_SHARED_LIBRARY_LINK_CXX_FLAGS="" \
|
||||
-D Trilinos_ENABLE_Triutils=OFF \
|
||||
-D Trilinos_ENABLE_INSTALL_CMAKE_CONFIG_FILES:BOOL=OFF \
|
||||
-D Trilinos_ENABLE_DEBUG:BOOL=OFF \
|
||||
-D Trilinos_ENABLE_CHECKED_STL:BOOL=OFF \
|
||||
-D Trilinos_ENABLE_EXPLICIT_INSTANTIATION:BOOL=OFF \
|
||||
-D Trilinos_WARNINGS_AS_ERRORS_FLAGS:STRING="" \
|
||||
-D Trilinos_ENABLE_ALL_PACKAGES:BOOL=OFF \
|
||||
-D Trilinos_ENABLE_ALL_OPTIONAL_PACKAGES:BOOL=OFF \
|
||||
-D BUILD_SHARED_LIBS:BOOL=OFF \
|
||||
-D DART_TESTING_TIMEOUT:STRING=600 \
|
||||
-D CMAKE_VERBOSE_MAKEFILE:BOOL=OFF \
|
||||
\
|
||||
\
|
||||
-D CMAKE_CXX_COMPILER:FILEPATH="${MPI_PATH}/bin/mpicxx" \
|
||||
-D CMAKE_C_COMPILER:FILEPATH="${MPI_PATH}/bin/mpicc" \
|
||||
-D MPI_CXX_COMPILER:FILEPATH="${MPI_PATH}/bin/mpicxx" \
|
||||
-D MPI_C_COMPILER:FILEPATH="${MPI_PATH}/bin/mpicc" \
|
||||
-D CMAKE_Fortran_COMPILER:FILEPATH="${MPI_PATH}/bin/mpif77" \
|
||||
-D MPI_EXEC:FILEPATH="${MPI_PATH}/bin/mpirun" \
|
||||
-D MPI_EXEC_POST_NUMPROCS_FLAGS:STRING="-bind-to;socket;--map-by;socket;env;CUDA_MANAGED_FORCE_DEVICE_ALLOC=1;CUDA_LAUNCH_BLOCKING=1;OMP_NUM_THREADS=2" \
|
||||
\
|
||||
\
|
||||
-D Trilinos_ENABLE_CXX11:BOOL=OFF \
|
||||
-D TPL_ENABLE_MPI:BOOL=ON \
|
||||
-D Trilinos_ENABLE_OpenMP:BOOL=ON \
|
||||
-D Trilinos_ENABLE_ThreadPool:BOOL=ON \
|
||||
\
|
||||
\
|
||||
-D TPL_ENABLE_CUDA:BOOL=ON \
|
||||
-D CUDA_TOOLKIT_ROOT_DIR:FILEPATH="${CUDA_PATH}" \
|
||||
-D CUDA_PROPAGATE_HOST_FLAGS:BOOL=OFF \
|
||||
-D TPL_ENABLE_Thrust:BOOL=OFF \
|
||||
-D Thrust_INCLUDE_DIRS:FILEPATH="${CUDA_PATH}/include" \
|
||||
-D TPL_ENABLE_CUSPARSE:BOOL=OFF \
|
||||
-D TPL_ENABLE_Cusp:BOOL=OFF \
|
||||
-D Cusp_INCLUDE_DIRS="/home/crtrott/Software/cusp" \
|
||||
-D CUDA_VERBOSE_BUILD:BOOL=OFF \
|
||||
-D CUDA_NVCC_FLAGS:STRING=${CUDA_NVCC_FLAGS} \
|
||||
\
|
||||
\
|
||||
-D TPL_ENABLE_HWLOC=OFF \
|
||||
-D HWLOC_INCLUDE_DIRS="/usr/local/software/hwloc/current/include" \
|
||||
-D HWLOC_LIBRARY_DIRS="/usr/local/software/hwloc/current/lib" \
|
||||
-D TPL_ENABLE_BinUtils=OFF \
|
||||
-D TPL_ENABLE_BLAS:STRING=ON \
|
||||
-D TPL_ENABLE_LAPACK:STRING=ON \
|
||||
-D TPL_ENABLE_MKL:STRING=OFF \
|
||||
-D TPL_ENABLE_HWLOC:STRING=OFF \
|
||||
-D TPL_ENABLE_GTEST:STRING=ON \
|
||||
-D TPL_ENABLE_SuperLU=ON \
|
||||
-D TPL_ENABLE_BLAS=ON \
|
||||
-D TPL_ENABLE_LAPACK=ON \
|
||||
-D TPL_SuperLU_LIBRARIES="/home/crtrott/Software/SuperLU_4.3/lib/libsuperlu_4.3.a" \
|
||||
-D TPL_SuperLU_INCLUDE_DIRS="/home/crtrott/Software/SuperLU_4.3/SRC" \
|
||||
\
|
||||
\
|
||||
-D Trilinos_Enable_Kokkos:BOOL=ON \
|
||||
-D Trilinos_ENABLE_KokkosCore:BOOL=ON \
|
||||
-D Trilinos_ENABLE_TeuchosKokkosCompat:BOOL=ON \
|
||||
-D Trilinos_ENABLE_KokkosContainers:BOOL=ON \
|
||||
-D Trilinos_ENABLE_TpetraKernels:BOOL=ON \
|
||||
-D Trilinos_ENABLE_KokkosAlgorithms:BOOL=ON \
|
||||
-D Trilinos_ENABLE_TeuchosKokkosComm:BOOL=ON \
|
||||
-D Trilinos_ENABLE_KokkosExample:BOOL=ON \
|
||||
-D Kokkos_ENABLE_EXAMPLES:BOOL=ON \
|
||||
-D Kokkos_ENABLE_TESTS:BOOL=OFF \
|
||||
-D KokkosClassic_DefaultNode:STRING="Kokkos::Compat::KokkosCudaWrapperNode" \
|
||||
-D TpetraClassic_ENABLE_OpenMPNode=OFF \
|
||||
-D TpetraClassic_ENABLE_TPINode=OFF \
|
||||
-D TpetraClassic_ENABLE_MKL=OFF \
|
||||
-D Kokkos_ENABLE_Cuda_UVM=ON \
|
||||
\
|
||||
\
|
||||
-D Trilinos_ENABLE_Teuchos:BOOL=ON \
|
||||
-D Teuchos_ENABLE_COMPLEX:BOOL=OFF \
|
||||
\
|
||||
\
|
||||
-D Trilinos_ENABLE_Tpetra:BOOL=ON \
|
||||
-D Tpetra_ENABLE_KokkosCore=ON \
|
||||
-D Tpetra_ENABLE_Kokkos_DistObject=OFF \
|
||||
-D Tpetra_ENABLE_Kokkos_Refactor=ON \
|
||||
-D Tpetra_ENABLE_TESTS=ON \
|
||||
-D Tpetra_ENABLE_EXAMPLES=ON \
|
||||
-D Tpetra_ENABLE_MPI_CUDA_RDMA:BOOL=ON \
|
||||
\
|
||||
\
|
||||
-D Trilinos_ENABLE_Belos=OFF \
|
||||
-D Trilinos_ENABLE_Amesos=OFF \
|
||||
-D Trilinos_ENABLE_Amesos2=OFF \
|
||||
-D Trilinos_ENABLE_Ifpack=OFF \
|
||||
-D Trilinos_ENABLE_Ifpack2=OFF \
|
||||
-D Trilinos_ENABLE_Epetra=OFF \
|
||||
-D Trilinos_ENABLE_EpetraExt=OFF \
|
||||
-D Trilinos_ENABLE_Zoltan=OFF \
|
||||
-D Trilinos_ENABLE_Zoltan2=OFF \
|
||||
-D Trilinos_ENABLE_MueLu=OFF \
|
||||
-D Belos_ENABLE_TESTS=ON \
|
||||
-D Belos_ENABLE_EXAMPLES=ON \
|
||||
-D MueLu_ENABLE_TESTS=ON \
|
||||
-D MueLu_ENABLE_EXAMPLES=ON \
|
||||
-D Ifpack2_ENABLE_TESTS=ON \
|
||||
-D Ifpack2_ENABLE_EXAMPLES=ON \
|
||||
$EXTRA_ARGS \
|
||||
${HOME}/Trilinos
|
||||
|
|
@ -1,148 +0,0 @@
|
|||
// -------------------------------------------------------------------------------- //
|
||||
|
||||
The following steps are for workstations/servers with the SEMS environment installed.
|
||||
|
||||
// -------------------------------------------------------------------------------- //
|
||||
Summary:
|
||||
|
||||
- Step 1: Rigorous testing of Kokkos' develop branch for each backend (Serial, OpenMP, Threads, Cuda) with all supported compilers.
|
||||
|
||||
- Step 2: Snapshot Kokkos' develop branch into current Trilinos develop branch.
|
||||
|
||||
- Step 3: Build and test Trilinos with combinations of compilers, types, backends.
|
||||
|
||||
- Step 4: Promote Kokkos develop branch to master if the snapshot does not cause any new tests to fail; else track/fix causes of new failures.
|
||||
|
||||
- Step 5: Snapshot Kokkos tagged master branch into Trilinos and push Trilinos.
|
||||
// -------------------------------------------------------------------------------- //
|
||||
|
||||
|
||||
// -------------------------------------------------------------------------------- //
|
||||
|
||||
Step 1:
|
||||
1.1. Update kokkos develop branch (NOT a fork)
|
||||
|
||||
(From kokkos directory):
|
||||
git fetch --all
|
||||
git checkout develop
|
||||
git reset --hard origin/develop
|
||||
|
||||
1.2. Create a testing directory - here the directory is created within the kokkos directory
|
||||
|
||||
mkdir testing
|
||||
cd testing
|
||||
|
||||
1.3. Run the test_all_sandia script; various compiler and build-list options can be specified
|
||||
|
||||
../config/test_all_sandia
|
||||
|
||||
1.4. Clean repository of untracked files
|
||||
|
||||
cd ../
|
||||
git clean -df
|
||||
|
||||
// -------------------------------------------------------------------------------- //
|
||||
|
||||
Step 2:
|
||||
2.1. Update Trilinos develop branch
|
||||
|
||||
(From Trilinos directory):
|
||||
git checkout develop
|
||||
git fetch --all
|
||||
git reset --hard origin/develop
|
||||
git clean -df
|
||||
|
||||
2.2. Snapshot Kokkos into Trilinos - this requires python/2.7.9, and both the Trilinos and Kokkos working trees must be clean (no untracked or modified files)
|
||||
|
||||
module load python/2.7.9
|
||||
python KOKKOS_PATH/config/snapshot.py KOKKOS_PATH TRILINOS_PATH/packages
|
||||
|
||||
// -------------------------------------------------------------------------------- //
|
||||
|
||||
Step 3:
|
||||
3.1. Build and test Trilinos with 4 different configurations; Run scripts for white and shepard are provided in kokkos/config/trilinos-integration
|
||||
|
||||
It is usually a good idea to run these scripts via nohup.
|
||||
You can run all four at the same time; use a separate directory for each.
|
||||
|
||||
3.2. Compare the failed test output between the pristine and the updated runs; investigate and fix problems if new tests fail after the Kokkos snapshot
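
The comparison in 3.2 can be sketched in shell as follows. This is only an illustration under assumptions: it presumes the failed test names of each run have already been written to a text file, one name per line, and the function name new_failures is hypothetical, not part of the Kokkos or Trilinos scripts.

```shell
#!/bin/sh
# Hypothetical helper (not part of kokkos/config): print the tests that
# fail in the updated run but did not fail in the pristine run.
# Assumes each input file lists one failed test name per line.
new_failures() {
    pristine_list="$1"
    updated_list="$2"
    pristine_sorted=$(mktemp) || return 1
    updated_sorted=$(mktemp) || { rm -f "$pristine_sorted"; return 1; }
    # comm requires sorted input; -u also drops duplicate names.
    sort -u "$pristine_list" > "$pristine_sorted"
    sort -u "$updated_list" > "$updated_sorted"
    # -13 suppresses lines unique to the first file and lines common to
    # both, leaving only the names unique to the updated run.
    comm -13 "$pristine_sorted" "$updated_sorted"
    rm -f "$pristine_sorted" "$updated_sorted"
}
```

Called as new_failures pristine_failed.txt updated_failed.txt (both file names are placeholders), an empty result would suggest the snapshot introduced no new failures; any name printed is a candidate for investigation.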
|
||||
|
||||
// -------------------------------------------------------------------------------- //
|
||||
|
||||
Step 4: Once all Trilinos tests pass, promote the Kokkos develop branch to master on GitHub
|
||||
4.1. Generate Changelog (You need a github API token)
|
||||
|
||||
Close all Open issues with "InDevelop" tag on github
|
||||
|
||||
(Not from kokkos directory)
|
||||
github_changelog_generator kokkos/kokkos --token TOKEN --no-pull-requests --include-labels 'InDevelop' --enhancement-labels 'enhancement,Feature Request' --future-release 'NEWTAG' --between-tags 'NEWTAG,OLDTAG'
|
||||
|
||||
(Copy the new section from the generated CHANGELOG.md to the kokkos/CHANGELOG.md)
|
||||
(Make desired changes to CHANGELOG.md to enhance clarity)
|
||||
(Commit and push the CHANGELOG to develop)
|
||||
|
||||
4.2. Merge develop into master
|
||||
|
||||
- DO NOT fast-forward the merge!!!!
|
||||
|
||||
(From kokkos directory):
|
||||
git checkout master
|
||||
git fetch --all
|
||||
# Ensure we are on the current origin/master
|
||||
git reset --hard origin/master
|
||||
git merge --no-ff origin/develop
|
||||
|
||||
4.3. Update the tag in kokkos/config/master_history.txt
|
||||
Tag description: MajorNumber.MinorNumber.WeeksSinceMinorNumberUpdate
|
||||
Tag format: #.#.##
|
||||
|
||||
# Prepend master_history.txt with
|
||||
|
||||
# tag: #.#.##
|
||||
# date: mm/dd/yyyy
|
||||
# master: sha1
|
||||
# develop: sha1
|
||||
# -----------------------
|
||||
|
||||
git commit --amend -a
|
||||
|
||||
git tag -a #.#.##
|
||||
tag: #.#.##
|
||||
date: mm/dd/yyyy
|
||||
master: sha1
|
||||
develop: sha1
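
The tag format and the master_history.txt entry described in 4.3 can be sketched in shell as follows. This is an illustration only: format_tag, history_entry and prepend_history are hypothetical helper names, not part of the Kokkos scripts, and the sha1 values must still be supplied by hand.

```shell
#!/bin/sh
# Sketch only: these helpers are hypothetical and not part of kokkos/config.

# Build a tag of the form #.#.## (Major.Minor.WeeksSinceMinorNumberUpdate).
format_tag() {
    printf '%d.%d.%02d\n' "$1" "$2" "$3"
}

# Print the block that is prepended to master_history.txt.
# Arguments: tag, date (mm/dd/yyyy), master sha1, develop sha1.
history_entry() {
    printf 'tag: %s\n' "$1"
    printf 'date: %s\n' "$2"
    printf 'master: %s\n' "$3"
    printf 'develop: %s\n' "$4"
    printf '%s\n' '-----------------------'
}

# Prepend the entry to an existing history file via a temporary file.
# Arguments: history file, then the four history_entry arguments.
prepend_history() {
    history_file="$1"
    shift
    tmp_file=$(mktemp) || return 1
    { history_entry "$@"; cat "$history_file"; } > "$tmp_file" &&
        mv "$tmp_file" "$history_file"
}
```

For example, format_tag 2 6 0 prints 2.6.00, and prepend_history config/master_history.txt 2.6.00 "$(date +%m/%d/%Y)" MASTER_SHA1 DEVELOP_SHA1 would place a new entry at the top of the file, where MASTER_SHA1 and DEVELOP_SHA1 are placeholders for the real commit hashes.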
|
||||
|
||||
4.4. Do NOT push yet
|
||||
|
||||
// -------------------------------------------------------------------------------- //
|
||||
|
||||
Step 5:
|
||||
5.1. Make sure Trilinos is up-to-date - chances are other changes have been committed since the integration testing process began. If a substantial change has occurred that may be affected by the snapshot, the testing procedure may need to be repeated.
|
||||
|
||||
(From Trilinos directory):
|
||||
git checkout develop
|
||||
git fetch --all
|
||||
git reset --hard origin/develop
|
||||
git clean -df
|
||||
|
||||
5.2. Snapshot Kokkos master branch into Trilinos
|
||||
|
||||
(From kokkos directory):
|
||||
git fetch --all
|
||||
git checkout tags/#.#.##
|
||||
git clean -df
|
||||
|
||||
python KOKKOS_PATH/config/snapshot.py KOKKOS_PATH TRILINOS_PATH/packages
|
||||
|
||||
5.3. Run checkin-test to push to trilinos using the CI build modules (gcc/4.9.3)
|
||||
|
||||
The modules are listed in kokkos/config/trilinos-integration/checkin-test
|
||||
Run checkin-test, forward dependencies and optional dependencies must be enabled
|
||||
If push failed because someone else clearly broke something, push manually.
|
||||
If push failed for unclear reasons, investigate, fix, and potentially start over from step 2 after resetting your local kokkos/master branch
|
||||
|
||||
Step 6: Push Kokkos to master
|
||||
|
||||
git push --follow-tags origin master
|
||||
|
||||
// -------------------------------------------------------------------------------- //
|
|
@ -1,110 +0,0 @@
|
|||
#!/bin/sh
|
||||
#
|
||||
# Copy this script, put it outside the Trilinos source directory, and
|
||||
# build there.
|
||||
#
|
||||
#-----------------------------------------------------------------------------
|
||||
# Building on 'kokkos-dev.sandia.gov' with enabled capabilities:
|
||||
#
|
||||
# Cuda, OpenMP, Threads, Qthreads, hwloc
|
||||
#
|
||||
# module loaded on 'kokkos-dev.sandia.gov' for this build
|
||||
#
|
||||
# module load cmake/2.8.11.2 gcc/4.8.3 cuda/6.5.14 nvcc-wrapper/gnu
|
||||
#
|
||||
# The 'nvcc-wrapper' module should load a script that matches
|
||||
# kokkos/bin/nvcc_wrapper
|
||||
#
|
||||
#-----------------------------------------------------------------------------
|
||||
# Source and installation directories:
|
||||
|
||||
TRILINOS_SOURCE_DIR=${HOME}/Trilinos
|
||||
TRILINOS_INSTALL_DIR=${HOME}/TrilinosInstall/`date +%F`
|
||||
|
||||
CMAKE_CONFIGURE=""
|
||||
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_INSTALL_PREFIX=${TRILINOS_INSTALL_DIR}"
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
# Debug/optimized
|
||||
|
||||
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_BUILD_TYPE:STRING=DEBUG"
|
||||
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_BOUNDS_CHECK:BOOL=ON"

CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_BUILD_TYPE:STRING=RELEASE"

#-----------------------------------------------------------------------------

CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_FLAGS:STRING=-Wall"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_C_COMPILER=gcc"

#-----------------------------------------------------------------------------
# Cuda using GNU, use the nvcc_wrapper to build CUDA source

# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_COMPILER=g++"

CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_COMPILER=nvcc_wrapper"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_CUDA:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_CUSPARSE:BOOL=ON"

#-----------------------------------------------------------------------------
# Configure for Kokkos subpackages and tests:

CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_Fortran:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_ALL_PACKAGES:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_EXAMPLES:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_TESTS:BOOL=ON"

CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosCore:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosContainers:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosAlgorithms:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_TpetraKernels:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosExample:BOOL=ON"

#-----------------------------------------------------------------------------
# Hardware locality configuration:

HWLOC_BASE_DIR="/home/projects/hwloc/1.7.1/host/gnu/4.7.3"

CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_HWLOC:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D HWLOC_INCLUDE_DIRS:FILEPATH=${HWLOC_BASE_DIR}/include"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D HWLOC_LIBRARY_DIRS:FILEPATH=${HWLOC_BASE_DIR}/lib"

#-----------------------------------------------------------------------------
# Pthread

CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_Pthread:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_Pthread:BOOL=ON"

#-----------------------------------------------------------------------------
# OpenMP

CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_OpenMP:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_OpenMP:BOOL=ON"

#-----------------------------------------------------------------------------
# Qthreads

QTHREADS_BASE_DIR="/home/projects/qthreads/2014-07-08/host/gnu/4.7.3"

CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_QTHREADS:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D QTHREADS_INCLUDE_DIRS:FILEPATH=${QTHREADS_BASE_DIR}/include"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D QTHREADS_LIBRARY_DIRS:FILEPATH=${QTHREADS_BASE_DIR}/lib"

#-----------------------------------------------------------------------------
# C++11

# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_CXX11:BOOL=ON"
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_CXX11:BOOL=ON"

#-----------------------------------------------------------------------------
#
# Remove CMake output files to force reconfigure from scratch.
#

rm -rf CMake* Trilinos* packages Dart* Testing cmake_install.cmake MakeFile*

#

echo cmake ${CMAKE_CONFIGURE} ${TRILINOS_SOURCE_DIR}

cmake ${CMAKE_CONFIGURE} ${TRILINOS_SOURCE_DIR}
@ -1,104 +0,0 @@
#!/bin/sh
#
# Copy this script, put it outside the Trilinos source directory, and
# build there.
#
#-----------------------------------------------------------------------------
# Building on 'kokkos-dev.sandia.gov' with enabled capabilities:
#
# Cuda, OpenMP, hwloc
#
# module loaded on 'kokkos-dev.sandia.gov' for this build
#
# module load cmake/2.8.11.2 gcc/4.8.3 cuda/6.5.14 nvcc-wrapper/gnu
#
# The 'nvcc-wrapper' module should load a script that matches
# kokkos/bin/nvcc_wrapper
#
#-----------------------------------------------------------------------------
# Source and installation directories:

TRILINOS_SOURCE_DIR=${HOME}/Trilinos
TRILINOS_INSTALL_DIR=${HOME}/TrilinosInstall/`date +%F`

CMAKE_CONFIGURE=""
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_INSTALL_PREFIX=${TRILINOS_INSTALL_DIR}"

#-----------------------------------------------------------------------------
# Debug/optimized

# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_BUILD_TYPE:STRING=DEBUG"
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_BOUNDS_CHECK:BOOL=ON"

CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_BUILD_TYPE:STRING=RELEASE"

#-----------------------------------------------------------------------------

CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_FLAGS:STRING=-Wall"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_C_COMPILER=gcc"

#-----------------------------------------------------------------------------
# Cuda using GNU, use the nvcc_wrapper to build CUDA source

# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_COMPILER=g++"

CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_COMPILER=nvcc_wrapper"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_CUDA:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_CUSPARSE:BOOL=ON"

#-----------------------------------------------------------------------------
# Configure for Kokkos subpackages and tests:

CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_Fortran:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_ALL_PACKAGES:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_EXAMPLES:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_TESTS:BOOL=ON"

CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosCore:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosContainers:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosAlgorithms:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_TpetraKernels:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosExample:BOOL=ON"

#-----------------------------------------------------------------------------
# Hardware locality configuration:

HWLOC_BASE_DIR="/home/projects/hwloc/1.7.1/host/gnu/4.7.3"

CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_HWLOC:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D HWLOC_INCLUDE_DIRS:FILEPATH=${HWLOC_BASE_DIR}/include"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D HWLOC_LIBRARY_DIRS:FILEPATH=${HWLOC_BASE_DIR}/lib"

#-----------------------------------------------------------------------------
# Pthread explicitly OFF so tribits doesn't automatically turn it on

CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_Pthread:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_Pthread:BOOL=OFF"

#-----------------------------------------------------------------------------
# OpenMP

CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_OpenMP:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_OpenMP:BOOL=ON"

#-----------------------------------------------------------------------------
# C++11

# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_CXX11:BOOL=ON"
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_CXX11:BOOL=ON"

#-----------------------------------------------------------------------------
#
# Remove CMake output files to force reconfigure from scratch.
#

rm -rf CMake* Trilinos* packages Dart* Testing cmake_install.cmake MakeFile*

#

echo cmake ${CMAKE_CONFIGURE} ${TRILINOS_SOURCE_DIR}

cmake ${CMAKE_CONFIGURE} ${TRILINOS_SOURCE_DIR}

#-----------------------------------------------------------------------------

@ -1,88 +0,0 @@
#!/bin/sh
#
# Copy this script, put it outside the Trilinos source directory, and
# build there.
#
#-----------------------------------------------------------------------------
# Building on 'kokkos-dev.sandia.gov' with enabled capabilities:
#
# Cuda
#
# module loaded on 'kokkos-dev.sandia.gov' for this build
#
# module load cmake/2.8.11.2 gcc/4.8.3 cuda/6.5.14 nvcc-wrapper/gnu
#
# The 'nvcc-wrapper' module should load a script that matches
# kokkos/bin/nvcc_wrapper
#
#-----------------------------------------------------------------------------
# Source and installation directories:

TRILINOS_SOURCE_DIR=${HOME}/Trilinos
TRILINOS_INSTALL_DIR=${HOME}/TrilinosInstall/`date +%F`

CMAKE_CONFIGURE=""
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_INSTALL_PREFIX=${TRILINOS_INSTALL_DIR}"

#-----------------------------------------------------------------------------
# Debug/optimized

# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_BUILD_TYPE:STRING=DEBUG"
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_BOUNDS_CHECK:BOOL=ON"

CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_BUILD_TYPE:STRING=RELEASE"

#-----------------------------------------------------------------------------

CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_FLAGS:STRING=-Wall"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_C_COMPILER=gcc"

#-----------------------------------------------------------------------------
# Cuda using GNU, use the nvcc_wrapper to build CUDA source

# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_COMPILER=g++"

CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_COMPILER=nvcc_wrapper"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_CUDA:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_CUSPARSE:BOOL=ON"

# Pthread explicitly OFF, otherwise tribits will automatically turn it on

CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_Pthread:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_Pthread:BOOL=OFF"

#-----------------------------------------------------------------------------
# Configure for Kokkos subpackages and tests:

CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_Fortran:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_ALL_PACKAGES:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_EXAMPLES:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_TESTS:BOOL=ON"

CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosCore:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosContainers:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosAlgorithms:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_TpetraKernels:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosExample:BOOL=ON"

#-----------------------------------------------------------------------------
# C++11

# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_CXX11:BOOL=ON"
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_CXX11:BOOL=ON"

#-----------------------------------------------------------------------------
#
# Remove CMake output files to force reconfigure from scratch.
#

rm -rf CMake* Trilinos* packages Dart* Testing cmake_install.cmake MakeFile*

#

echo cmake ${CMAKE_CONFIGURE} ${TRILINOS_SOURCE_DIR}

cmake ${CMAKE_CONFIGURE} ${TRILINOS_SOURCE_DIR}

#-----------------------------------------------------------------------------

@ -1,84 +0,0 @@
#!/bin/sh
#
# Copy this script, put it outside the Trilinos source directory, and
# build there.
#
#-----------------------------------------------------------------------------
# Building on 'kokkos-dev.sandia.gov' with enabled capabilities:
#
# C++11, OpenMP
#
# module loaded on 'kokkos-dev.sandia.gov' for this build
#
# module load cmake/2.8.11.2 gcc/4.8.3
#
#-----------------------------------------------------------------------------
# Source and installation directories:

TRILINOS_SOURCE_DIR=${HOME}/Trilinos
TRILINOS_INSTALL_DIR=${HOME}/TrilinosInstall/`date +%F`

CMAKE_CONFIGURE=""
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_INSTALL_PREFIX=${TRILINOS_INSTALL_DIR}"

#-----------------------------------------------------------------------------
# Debug/optimized

# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_BUILD_TYPE:STRING=DEBUG"
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_BOUNDS_CHECK:BOOL=ON"

CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_BUILD_TYPE:STRING=RELEASE"

#-----------------------------------------------------------------------------

CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_FLAGS:STRING=-Wall"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_C_COMPILER=gcc"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_COMPILER=g++"

#-----------------------------------------------------------------------------
# Configure for Kokkos subpackages and tests:

CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_Fortran:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_ALL_PACKAGES:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_EXAMPLES:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_TESTS:BOOL=ON"

CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosCore:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosContainers:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosAlgorithms:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_TpetraKernels:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosExample:BOOL=ON"

#-----------------------------------------------------------------------------
# Pthread explicitly OFF so tribits doesn't automatically activate

CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_Pthread:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_Pthread:BOOL=OFF"

#-----------------------------------------------------------------------------
# OpenMP

CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_OpenMP:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_OpenMP:BOOL=ON"

#-----------------------------------------------------------------------------
# C++11

CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_CXX11:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_CXX11:BOOL=ON"

#-----------------------------------------------------------------------------
#
# Remove CMake output files to force reconfigure from scratch.
#

rm -rf CMake* Trilinos* packages Dart* Testing cmake_install.cmake MakeFile*

#

echo cmake ${CMAKE_CONFIGURE} ${TRILINOS_SOURCE_DIR}

cmake ${CMAKE_CONFIGURE} ${TRILINOS_SOURCE_DIR}

#-----------------------------------------------------------------------------

@ -1,78 +0,0 @@
#!/bin/sh
#
# Copy this script, put it outside the Trilinos source directory, and
# build there.
#
#-----------------------------------------------------------------------------
# Building on 'kokkos-dev.sandia.gov' with enabled capabilities:
#
# <none>
#
# module loaded on 'kokkos-dev.sandia.gov' for this build
#
# module load cmake/2.8.11.2 gcc/4.8.3
#
#-----------------------------------------------------------------------------
# Source and installation directories:

TRILINOS_SOURCE_DIR=${HOME}/Trilinos
TRILINOS_INSTALL_DIR=${HOME}/TrilinosInstall/`date +%F`

CMAKE_CONFIGURE=""
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_INSTALL_PREFIX=${TRILINOS_INSTALL_DIR}"

#-----------------------------------------------------------------------------
# Debug/optimized

CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_BUILD_TYPE:STRING=DEBUG"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_BOUNDS_CHECK:BOOL=ON"

# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_BUILD_TYPE:STRING=RELEASE"

#-----------------------------------------------------------------------------

CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_FLAGS:STRING=-Wall"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_C_COMPILER=gcc"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_COMPILER=g++"

#-----------------------------------------------------------------------------
# Configure for Kokkos subpackages and tests:

CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_Fortran:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_ALL_PACKAGES:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_EXAMPLES:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_TESTS:BOOL=ON"

CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosCore:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosContainers:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosAlgorithms:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_TpetraKernels:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosExample:BOOL=ON"

#-----------------------------------------------------------------------------
# Kokkos Pthread explicitly OFF, TPL Pthread ON for gtest

CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_Pthread:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_Pthread:BOOL=OFF"

#-----------------------------------------------------------------------------
# C++11

# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_CXX11:BOOL=ON"
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_CXX11:BOOL=ON"

#-----------------------------------------------------------------------------
#
# Remove CMake output files to force reconfigure from scratch.
#

rm -rf CMake* Trilinos* packages Dart* Testing cmake_install.cmake MakeFile*

#

echo cmake ${CMAKE_CONFIGURE} ${TRILINOS_SOURCE_DIR}

cmake ${CMAKE_CONFIGURE} ${TRILINOS_SOURCE_DIR}

#-----------------------------------------------------------------------------

@ -1,89 +0,0 @@
#!/bin/sh
#
# Copy this script, put it outside the Trilinos source directory, and
# build there.
#
#-----------------------------------------------------------------------------
# Building on 'kokkos-dev.sandia.gov' with enabled capabilities:
#
# Intel, OpenMP, Cuda
#
# module loaded on 'kokkos-dev.sandia.gov' for this build
#
# module load cmake/2.8.11.2 cuda/7.0.4 intel/2015.0.090 nvcc-wrapper/intel
#
#-----------------------------------------------------------------------------
# Source and installation directories:

TRILINOS_SOURCE_DIR=${HOME}/Trilinos
TRILINOS_INSTALL_DIR=${HOME}/TrilinosInstall/`date +%F`

CMAKE_CONFIGURE=""
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_INSTALL_PREFIX=${TRILINOS_INSTALL_DIR}"

#-----------------------------------------------------------------------------
# Debug/optimized

# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_BUILD_TYPE:STRING=DEBUG"
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_BOUNDS_CHECK:BOOL=ON"

CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_BUILD_TYPE:STRING=RELEASE"

#-----------------------------------------------------------------------------

CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_FLAGS:STRING=-Wall"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_C_COMPILER=icc"

# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_COMPILER=icpc"

CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_COMPILER=nvcc_wrapper"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_CUDA:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_CUSPARSE:BOOL=ON"

#-----------------------------------------------------------------------------
# Configure for Kokkos subpackages and tests:

CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_Fortran:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_ALL_PACKAGES:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_EXAMPLES:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_TESTS:BOOL=ON"

CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosCore:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosContainers:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosAlgorithms:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_TpetraKernels:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosExample:BOOL=ON"

#-----------------------------------------------------------------------------
# Pthread explicitly OFF

CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_Pthread:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_Pthread:BOOL=OFF"

#-----------------------------------------------------------------------------
# OpenMP

CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_OpenMP:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_OpenMP:BOOL=ON"

#-----------------------------------------------------------------------------
# C++11

# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_CXX11:BOOL=ON"
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_CXX11:BOOL=ON"

#-----------------------------------------------------------------------------
#
# Remove CMake output files to force reconfigure from scratch.
#

rm -rf CMake* Trilinos* packages Dart* Testing cmake_install.cmake MakeFile*

#

echo cmake ${CMAKE_CONFIGURE} ${TRILINOS_SOURCE_DIR}

cmake ${CMAKE_CONFIGURE} ${TRILINOS_SOURCE_DIR}

#-----------------------------------------------------------------------------

@ -1,84 +0,0 @@
#!/bin/sh
#
# Copy this script, put it outside the Trilinos source directory, and
# build there.
#
#-----------------------------------------------------------------------------
# Building on 'kokkos-dev.sandia.gov' with enabled capabilities:
#
# Intel, OpenMP
#
# module loaded on 'kokkos-dev.sandia.gov' for this build
#
# module load cmake/2.8.11.2 intel/13.SP1.1.106
#
#-----------------------------------------------------------------------------
# Source and installation directories:

TRILINOS_SOURCE_DIR=${HOME}/Trilinos
TRILINOS_INSTALL_DIR=${HOME}/TrilinosInstall/`date +%F`

CMAKE_CONFIGURE=""
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_INSTALL_PREFIX=${TRILINOS_INSTALL_DIR}"

#-----------------------------------------------------------------------------
# Debug/optimized

# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_BUILD_TYPE:STRING=DEBUG"
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_BOUNDS_CHECK:BOOL=ON"

CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_BUILD_TYPE:STRING=RELEASE"

#-----------------------------------------------------------------------------

CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_FLAGS:STRING=-Wall"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_C_COMPILER=icc"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_COMPILER=icpc"

#-----------------------------------------------------------------------------
# Configure for Kokkos subpackages and tests:

CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_Fortran:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_ALL_PACKAGES:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_EXAMPLES:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_TESTS:BOOL=ON"

CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosCore:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosContainers:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosAlgorithms:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_TpetraKernels:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosExample:BOOL=ON"

#-----------------------------------------------------------------------------
# Pthread explicitly OFF

CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_Pthread:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_Pthread:BOOL=OFF"

#-----------------------------------------------------------------------------
# OpenMP

CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_OpenMP:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_OpenMP:BOOL=ON"

#-----------------------------------------------------------------------------
# C++11

# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_CXX11:BOOL=ON"
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_CXX11:BOOL=ON"

#-----------------------------------------------------------------------------
#
# Remove CMake output files to force reconfigure from scratch.
#

rm -rf CMake* Trilinos* packages Dart* Testing cmake_install.cmake MakeFile*

#

echo cmake ${CMAKE_CONFIGURE} ${TRILINOS_SOURCE_DIR}

cmake ${CMAKE_CONFIGURE} ${TRILINOS_SOURCE_DIR}

#-----------------------------------------------------------------------------

@ -1,77 +0,0 @@
#!/bin/sh
#
# Copy this script, put it outside the Trilinos source directory, and
# build there.
#
#-----------------------------------------------------------------------------
# Building on 'kokkos-dev.sandia.gov' with enabled capabilities:
#
# OpenMP
#
# module loaded on 'kokkos-dev.sandia.gov' for this build
#
# module load cmake/2.8.11.2 gcc/4.8.3
#
#-----------------------------------------------------------------------------
# Source and installation directories:

TRILINOS_SOURCE_DIR=${HOME}/Trilinos
TRILINOS_INSTALL_DIR=${HOME}/TrilinosInstall/`date +%F`

CMAKE_CONFIGURE=""
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_INSTALL_PREFIX=${TRILINOS_INSTALL_DIR}"

#-----------------------------------------------------------------------------
# Debug/optimized

# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_BUILD_TYPE:STRING=DEBUG"
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_BOUNDS_CHECK:BOOL=ON"

CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_BUILD_TYPE:STRING=RELEASE"

#-----------------------------------------------------------------------------

CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_FLAGS:STRING=-Wall"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_C_COMPILER=gcc"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_COMPILER=g++"

#-----------------------------------------------------------------------------
# Configure for Kokkos subpackages and tests:

CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_Fortran:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_ALL_PACKAGES:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_EXAMPLES:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_TESTS:BOOL=ON"

CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosCore:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosContainers:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosAlgorithms:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_TpetraKernels:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosExample:BOOL=ON"

#-----------------------------------------------------------------------------
# OpenMP

CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_OpenMP:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_OpenMP:BOOL=ON"

# Pthread explicitly OFF, otherwise tribits will automatically turn it on

CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_Pthread:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_Pthread:BOOL=OFF"

#-----------------------------------------------------------------------------
#
# Remove CMake output files to force reconfigure from scratch.
#

rm -rf CMake* Trilinos* packages Dart* Testing cmake_install.cmake MakeFile*

#

echo cmake ${CMAKE_CONFIGURE} ${TRILINOS_SOURCE_DIR}

cmake ${CMAKE_CONFIGURE} ${TRILINOS_SOURCE_DIR}

#-----------------------------------------------------------------------------

@ -1,87 +0,0 @@
#!/bin/sh
#
# Copy this script, put it outside the Trilinos source directory, and
# build there.
#
#-----------------------------------------------------------------------------
# Building on 'kokkos-dev.sandia.gov' with enabled capabilities:
#
# Threads, hwloc
#
# module loaded on 'kokkos-dev.sandia.gov' for this build
#
# module load cmake/2.8.11.2 gcc/4.8.3
#
#-----------------------------------------------------------------------------
# Source and installation directories:

TRILINOS_SOURCE_DIR=${HOME}/Trilinos
TRILINOS_INSTALL_DIR=${HOME}/TrilinosInstall/`date +%F`

CMAKE_CONFIGURE=""
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_INSTALL_PREFIX=${TRILINOS_INSTALL_DIR}"

#-----------------------------------------------------------------------------
# Debug/optimized

# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_BUILD_TYPE:STRING=DEBUG"
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_BOUNDS_CHECK:BOOL=ON"

CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_BUILD_TYPE:STRING=RELEASE"

#-----------------------------------------------------------------------------

CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_FLAGS:STRING=-Wall"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_C_COMPILER=gcc"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_COMPILER=g++"

#-----------------------------------------------------------------------------
# Configure for Kokkos subpackages and tests:

CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_Fortran:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_ALL_PACKAGES:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_EXAMPLES:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_TESTS:BOOL=ON"

CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosCore:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosContainers:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosAlgorithms:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_TpetraKernels:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosExample:BOOL=ON"

#-----------------------------------------------------------------------------
# Hardware locality configuration:

HWLOC_BASE_DIR="/home/projects/hwloc/1.7.1/host/gnu/4.7.3"

CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_HWLOC:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D HWLOC_INCLUDE_DIRS:FILEPATH=${HWLOC_BASE_DIR}/include"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D HWLOC_LIBRARY_DIRS:FILEPATH=${HWLOC_BASE_DIR}/lib"

#-----------------------------------------------------------------------------
# Pthread

CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_Pthread:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_Pthread:BOOL=ON"

#-----------------------------------------------------------------------------
# C++11

# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_CXX11:BOOL=ON"
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_CXX11:BOOL=ON"

#-----------------------------------------------------------------------------
#
# Remove CMake output files to force reconfigure from scratch.
#

rm -rf CMake* Trilinos* packages Dart* Testing cmake_install.cmake MakeFile*

#

echo cmake ${CMAKE_CONFIGURE} ${TRILINOS_SOURCE_DIR}

cmake ${CMAKE_CONFIGURE} ${TRILINOS_SOURCE_DIR}

#-----------------------------------------------------------------------------

@ -1,340 +0,0 @@
|
|||
#!/bin/bash
|
||||
#
|
||||
# This shell script (nvcc_wrapper) wraps both the host compiler and
|
||||
# NVCC, if you are building legacy C or C++ code with CUDA enabled.
|
||||
# The script remedies some differences between the interface of NVCC
|
||||
# and that of the host compiler, in particular for linking.
|
||||
# It also means that a legacy code doesn't need separate .cu files;
|
||||
# it can just use .cpp files.
|
||||
#
|
||||
# Default settings: change those according to your machine. For
|
||||
# example, you may have two different wrappers with either icpc
|
||||
# or g++ as their back-end compiler. The defaults can be overwritten
|
||||
# by using the usual arguments (e.g., -arch=sm_30 -ccbin icpc).
|
||||
|
||||
default_arch="sm_35"
|
||||
#default_arch="sm_50"
|
||||
|
||||
#
|
||||
# The default C++ compiler.
|
||||
#
|
||||
host_compiler=${NVCC_WRAPPER_DEFAULT_COMPILER:-"g++"}
|
||||
#host_compiler="icpc"
|
||||
#host_compiler="/usr/local/gcc/4.8.3/bin/g++"
|
||||
#host_compiler="/usr/local/gcc/4.9.1/bin/g++"
|
||||
|
||||
#
|
||||
# Internal variables
|
||||
#
|
||||
|
||||
# C++ files
|
||||
cpp_files=""
|
||||
|
||||
# Host compiler arguments
|
||||
xcompiler_args=""
|
||||
|
||||
# Cuda (NVCC) only arguments
|
||||
cuda_args=""
|
||||
|
||||
# Arguments for both NVCC and Host compiler
|
||||
shared_args=""
|
||||
|
||||
# Argument -c
|
||||
compile_arg=""
|
||||
|
||||
# Argument -o <obj>
|
||||
output_arg=""
|
||||
|
||||
# Linker arguments
|
||||
xlinker_args=""
|
||||
|
||||
# Object files passable to NVCC
|
||||
object_files=""
|
||||
|
||||
# Link objects for the host linker only
|
||||
object_files_xlinker=""
|
||||
|
||||
# Shared libraries with version numbers are not handled correctly by NVCC
|
||||
shared_versioned_libraries_host=""
|
||||
shared_versioned_libraries=""
|
||||
|
||||
# Does the User set the architecture
|
||||
arch_set=0
|
||||
|
||||
# Does the user overwrite the host compiler
|
||||
ccbin_set=0
|
||||
|
||||
#Error code of compilation
|
||||
error_code=0
|
||||
|
||||
# Do a dry run without actually compiling
|
||||
dry_run=0
|
||||
|
||||
# Skip NVCC compilation and use host compiler directly
|
||||
host_only=0
|
||||
host_only_args=""
|
||||
|
||||
# Enable workaround for CUDA 6.5 for pragma ident
|
||||
replace_pragma_ident=0
|
||||
|
||||
# Mark first host compiler argument
|
||||
first_xcompiler_arg=1
|
||||
|
||||
temp_dir=${TMPDIR:-/tmp}
|
||||
|
||||
# Check if we have an optimization argument already
|
||||
optimization_applied=0
|
||||
|
||||
# Check if we have -std=c++X or --std=c++X already
|
||||
stdcxx_applied=0
|
||||
|
||||
# Run nvcc a second time to generate dependencies if needed
|
||||
depfile_separate=0
|
||||
depfile_output_arg=""
|
||||
depfile_target_arg=""
|
||||
|
||||
#echo "Arguments: $# $@"
|
||||
|
||||
while [ $# -gt 0 ]
|
||||
do
|
||||
case $1 in
|
||||
#show the executed command
|
||||
--show|--nvcc-wrapper-show)
|
||||
dry_run=1
|
||||
;;
|
||||
#run host compilation only
|
||||
--host-only)
|
||||
host_only=1
|
||||
;;
|
||||
#replace '#pragma ident' with '#ident'; this is needed to compile OpenMPI due to a configure script bug and the non-standardized behaviour of pragma with macros
|
||||
--replace-pragma-ident)
|
||||
replace_pragma_ident=1
|
||||
;;
|
||||
#handle source files to be compiled as cuda files
|
||||
*.cpp|*.cxx|*.cc|*.C|*.c++|*.cu)
|
||||
cpp_files="$cpp_files $1"
|
||||
;;
|
||||
# Ensure we only have one optimization flag because NVCC doesn't allow multiple
|
||||
-O*)
|
||||
if [ $optimization_applied -eq 1 ]; then
|
||||
echo "nvcc_wrapper - *warning* you have set multiple optimization flags (-O*), only the first is used because nvcc can only accept a single optimization setting."
|
||||
else
|
||||
shared_args="$shared_args $1"
|
||||
optimization_applied=1
|
||||
fi
|
||||
;;
|
||||
#Handle shared args (valid for both nvcc and the host compiler)
|
||||
-D*|-I*|-L*|-l*|-g|--help|--version|-E|-M|-shared)
|
||||
shared_args="$shared_args $1"
|
||||
;;
|
||||
#Handle compilation argument
|
||||
-c)
|
||||
compile_arg="$1"
|
||||
;;
|
||||
#Handle output argument
|
||||
-o)
|
||||
output_arg="$output_arg $1 $2"
|
||||
shift
|
||||
;;
|
||||
# Handle depfile arguments. We map them to a separate call to nvcc.
|
||||
-MD|-MMD)
|
||||
depfile_separate=1
|
||||
host_only_args="$host_only_args $1"
|
||||
;;
|
||||
-MF)
|
||||
depfile_output_arg="-o $2"
|
||||
host_only_args="$host_only_args $1 $2"
|
||||
shift
|
||||
;;
|
||||
-MT)
|
||||
depfile_target_arg="$1 $2"
|
||||
host_only_args="$host_only_args $1 $2"
|
||||
shift
|
||||
;;
|
||||
#Handle known nvcc args
|
||||
-gencode*|--dryrun|--verbose|--keep|--keep-dir*|-G|--relocatable-device-code*|-lineinfo|-expt-extended-lambda|--resource-usage|-Xptxas*)
|
||||
cuda_args="$cuda_args $1"
|
||||
;;
|
||||
#Handle more known nvcc args
|
||||
--expt-extended-lambda|--expt-relaxed-constexpr)
|
||||
cuda_args="$cuda_args $1"
|
||||
;;
|
||||
#Handle known nvcc args that have an argument
|
||||
-rdc|-maxrregcount|--default-stream)
|
||||
cuda_args="$cuda_args $1 $2"
|
||||
shift
|
||||
;;
|
||||
#Handle c++11
|
||||
--std=c++11|-std=c++11|--std=c++14|-std=c++14|--std=c++1z|-std=c++1z)
|
||||
if [ $stdcxx_applied -eq 1 ]; then
|
||||
echo "nvcc_wrapper - *warning* you have set multiple standard flags (-std=c++1* or --std=c++1*), only the first is used because nvcc can only accept a single std setting"
|
||||
else
|
||||
shared_args="$shared_args $1"
|
||||
stdcxx_applied=1
|
||||
fi
|
||||
;;
|
||||
|
||||
#strip off -std=c++98 because nvcc warns about it and Tribits will place both -std=c++11 and -std=c++98
|
||||
-std=c++98|--std=c++98)
|
||||
;;
|
||||
#strip off pedantic because it produces endless warnings about #LINE added by the preprocessor
|
||||
-pedantic|-Wpedantic|-ansi)
|
||||
;;
|
||||
#strip off -Woverloaded-virtual to avoid "cc1: warning: command line option ‘-Woverloaded-virtual’ is valid for C++/ObjC++ but not for C"
|
||||
-Woverloaded-virtual)
|
||||
;;
|
||||
#strip -Xcompiler because we add it
|
||||
-Xcompiler)
|
||||
if [ $first_xcompiler_arg -eq 1 ]; then
|
||||
xcompiler_args="$2"
|
||||
first_xcompiler_arg=0
|
||||
else
|
||||
xcompiler_args="$xcompiler_args,$2"
|
||||
fi
|
||||
shift
|
||||
;;
|
||||
#strip off "-x cu" because we add that
|
||||
-x)
|
||||
if [[ $2 != "cu" ]]; then
|
||||
if [ $first_xcompiler_arg -eq 1 ]; then
|
||||
xcompiler_args="-x,$2"
|
||||
first_xcompiler_arg=0
|
||||
else
|
||||
xcompiler_args="$xcompiler_args,-x,$2"
|
||||
fi
|
||||
fi
|
||||
shift
|
||||
;;
|
||||
#Handle -ccbin (if it's not set we can set it to a default value)
|
||||
-ccbin)
|
||||
cuda_args="$cuda_args $1 $2"
|
||||
ccbin_set=1
|
||||
host_compiler=$2
|
||||
shift
|
||||
;;
|
||||
#Handle -arch argument (if it's not set, use a default)
|
||||
-arch*)
|
||||
cuda_args="$cuda_args $1"
|
||||
arch_set=1
|
||||
;;
|
||||
#Handle -Xcudafe argument
|
||||
-Xcudafe)
|
||||
cuda_args="$cuda_args -Xcudafe $2"
|
||||
shift
|
||||
;;
|
||||
#Handle args that should be sent to the linker
|
||||
-Wl*)
|
||||
xlinker_args="$xlinker_args -Xlinker ${1:4:${#1}}"
|
||||
host_linker_args="$host_linker_args ${1:4:${#1}}"
|
||||
;;
|
||||
#Handle object files: -x cu applies to all input files, so give them to linker, except if only linking
|
||||
*.a|*.so|*.o|*.obj)
|
||||
object_files="$object_files $1"
|
||||
object_files_xlinker="$object_files_xlinker -Xlinker $1"
|
||||
;;
|
||||
#Handle object files which always need to use "-Xlinker": -x cu applies to all input files, so give them to linker, except if only linking
|
||||
@*|*.dylib)
|
||||
object_files="$object_files -Xlinker $1"
|
||||
object_files_xlinker="$object_files_xlinker -Xlinker $1"
|
||||
;;
|
||||
#Handle shared libraries with *.so.* names which nvcc can't do.
|
||||
*.so.*)
|
||||
shared_versioned_libraries_host="$shared_versioned_libraries_host $1"
|
||||
shared_versioned_libraries="$shared_versioned_libraries -Xlinker $1"
|
||||
;;
|
||||
#All other args are sent to the host compiler
|
||||
*)
|
||||
if [ $first_xcompiler_arg -eq 1 ]; then
|
||||
xcompiler_args=$1
|
||||
first_xcompiler_arg=0
|
||||
else
|
||||
xcompiler_args="$xcompiler_args,$1"
|
||||
fi
|
||||
;;
|
||||
esac
|
||||
|
||||
shift
|
||||
done
|
||||
|
||||
#Add default host compiler if necessary
|
||||
if [ $ccbin_set -ne 1 ]; then
|
||||
cuda_args="$cuda_args -ccbin $host_compiler"
|
||||
fi
|
||||
|
||||
#Add architecture command
|
||||
if [ $arch_set -ne 1 ]; then
|
||||
cuda_args="$cuda_args -arch=$default_arch"
|
||||
fi
|
||||
|
||||
#Compose compilation command
|
||||
nvcc_command="nvcc $cuda_args $shared_args $xlinker_args $shared_versioned_libraries"
|
||||
if [ $first_xcompiler_arg -eq 0 ]; then
|
||||
nvcc_command="$nvcc_command -Xcompiler $xcompiler_args"
|
||||
fi
|
||||
|
||||
#Compose host only command
|
||||
host_command="$host_compiler $shared_args $host_only_args $compile_arg $output_arg $xcompiler_args $host_linker_args $shared_versioned_libraries_host"
|
||||
|
||||
#nvcc does not accept '#pragma ident SOME_MACRO_STRING' but it does accept '#ident SOME_MACRO_STRING'
|
||||
if [ $replace_pragma_ident -eq 1 ]; then
|
||||
cpp_files2=""
|
||||
for file in $cpp_files
|
||||
do
|
||||
var=`grep pragma ${file} | grep ident | grep "#"`
|
||||
if [ "${#var}" -gt 0 ]
|
||||
then
|
||||
sed 's/#[\ \t]*pragma[\ \t]*ident/#ident/g' $file > $temp_dir/nvcc_wrapper_tmp_$file
|
||||
cpp_files2="$cpp_files2 $temp_dir/nvcc_wrapper_tmp_$file"
|
||||
else
|
||||
cpp_files2="$cpp_files2 $file"
|
||||
fi
|
||||
done
|
||||
cpp_files=$cpp_files2
|
||||
#echo $cpp_files
|
||||
fi
|
||||
|
||||
if [ "$cpp_files" ]; then
|
||||
nvcc_command="$nvcc_command $object_files_xlinker -x cu $cpp_files"
|
||||
else
|
||||
nvcc_command="$nvcc_command $object_files"
|
||||
fi
|
||||
|
||||
if [ "$cpp_files" ]; then
|
||||
host_command="$host_command $object_files $cpp_files"
|
||||
else
|
||||
host_command="$host_command $object_files"
|
||||
fi
|
||||
|
||||
if [ $depfile_separate -eq 1 ]; then
|
||||
# run nvcc a second time to generate dependencies (without compiling)
|
||||
nvcc_depfile_command="$nvcc_command -M $depfile_target_arg $depfile_output_arg"
|
||||
else
|
||||
nvcc_depfile_command=""
|
||||
fi
|
||||
|
||||
nvcc_command="$nvcc_command $compile_arg $output_arg"
|
||||
|
||||
#Print command for dryrun
|
||||
if [ $dry_run -eq 1 ]; then
|
||||
if [ $host_only -eq 1 ]; then
|
||||
echo $host_command
|
||||
elif [ -n "$nvcc_depfile_command" ]; then
|
||||
echo $nvcc_command "&&" $nvcc_depfile_command
|
||||
else
|
||||
echo $nvcc_command
|
||||
fi
|
||||
exit 0
|
||||
fi
|
||||
|
||||
#Run compilation command
|
||||
if [ $host_only -eq 1 ]; then
|
||||
$host_command
|
||||
elif [ -n "$nvcc_depfile_command" ]; then
|
||||
$nvcc_command && $nvcc_depfile_command
|
||||
else
|
||||
$nvcc_command
|
||||
fi
|
||||
error_code=$?
|
||||
|
||||
#Report error code
|
||||
exit $error_code
|
|
@ -14,25 +14,52 @@ PROCESSOR=`uname -p`
|
|||
|
||||
if [[ "$HOSTNAME" =~ (white|ride).* ]]; then
|
||||
MACHINE=white
|
||||
elif [[ "$HOSTNAME" =~ .*bowman.* ]]; then
|
||||
module load git
|
||||
fi
|
||||
|
||||
if [[ "$HOSTNAME" =~ .*bowman.* ]]; then
|
||||
MACHINE=bowman
|
||||
elif [[ "$HOSTNAME" =~ n.* ]]; then # Warning: very generic name
|
||||
module load git
|
||||
fi
|
||||
|
||||
if [[ "$HOSTNAME" =~ n.* ]]; then # Warning: very generic name
|
||||
if [[ "$PROCESSOR" = "aarch64" ]]; then
|
||||
MACHINE=sullivan
|
||||
module load git
|
||||
fi
|
||||
elif [[ "$HOSTNAME" =~ node.* ]]; then # Warning: very generic name
|
||||
fi
|
||||
|
||||
if [[ "$HOSTNAME" =~ node.* ]]; then # Warning: very generic name
|
||||
if [[ "$MACHINE" = "" ]]; then
|
||||
MACHINE=shepard
|
||||
elif [[ "$HOSTNAME" =~ apollo ]]; then
|
||||
module load git
|
||||
fi
|
||||
fi
|
||||
|
||||
if [[ "$HOSTNAME" =~ apollo ]]; then
|
||||
MACHINE=apollo
|
||||
elif [[ "$HOSTNAME" =~ sullivan ]]; then
|
||||
module load git
|
||||
fi
|
||||
|
||||
if [[ "$HOSTNAME" =~ sullivan ]]; then
|
||||
MACHINE=sullivan
|
||||
elif [ ! -z "$SEMS_MODULEFILES_ROOT" ]; then
|
||||
MACHINE=sems
|
||||
else
|
||||
module load git
|
||||
fi
|
||||
|
||||
if [ ! -z "$SEMS_MODULEFILES_ROOT" ]; then
|
||||
if [[ "$MACHINE" = "" ]]; then
|
||||
MACHINE=sems
|
||||
module load sems-git
|
||||
fi
|
||||
fi
|
||||
|
||||
if [[ "$MACHINE" = "" ]]; then
|
||||
echo "Unrecognized machine" >&2
|
||||
exit 1
|
||||
fi
|
||||
|
||||
echo "Running on machine: $MACHINE"
|
||||
|
||||
GCC_BUILD_LIST="OpenMP,Pthread,Serial,OpenMP_Serial,Pthread_Serial"
|
||||
IBM_BUILD_LIST="OpenMP,Serial,OpenMP_Serial"
|
||||
ARM_GCC_BUILD_LIST="OpenMP,Serial,OpenMP_Serial"
|
||||
|
@ -45,7 +72,8 @@ GCC_WARNING_FLAGS="-Wall,-Wshadow,-pedantic,-Werror,-Wsign-compare,-Wtype-limits
|
|||
IBM_WARNING_FLAGS="-Wall,-Wshadow,-pedantic,-Werror,-Wsign-compare,-Wtype-limits,-Wuninitialized"
|
||||
CLANG_WARNING_FLAGS="-Wall,-Wshadow,-pedantic,-Werror,-Wsign-compare,-Wtype-limits,-Wuninitialized"
|
||||
INTEL_WARNING_FLAGS="-Wall,-Wshadow,-pedantic,-Werror,-Wsign-compare,-Wtype-limits,-Wuninitialized"
|
||||
CUDA_WARNING_FLAGS="-Wall,-Wshadow,-pedantic,-Werror,-Wsign-compare,-Wtype-limits,-Wuninitialized"
|
||||
#CUDA_WARNING_FLAGS="-Wall,-Wshadow,-pedantic,-Werror,-Wsign-compare,-Wtype-limits,-Wuninitialized"
|
||||
CUDA_WARNING_FLAGS="-Wall,-Wshadow,-pedantic,-Wsign-compare,-Wtype-limits,-Wuninitialized"
|
||||
PGI_WARNING_FLAGS=""
|
||||
|
||||
# Default. Machine specific can override.
|
||||
|
@ -142,6 +170,18 @@ else
|
|||
KOKKOS_PATH=$( cd $KOKKOS_PATH && pwd )
|
||||
fi
|
||||
|
||||
UNCOMMITTED=`cd ${KOKKOS_PATH}; git status --porcelain 2>/dev/null`
|
||||
if ! [ -z "$UNCOMMITTED" ]; then
|
||||
echo "WARNING!! THE FOLLOWING CHANGES ARE UNCOMMITTED!! :"
|
||||
echo "$UNCOMMITTED"
|
||||
echo ""
|
||||
fi
|
||||
|
||||
GITSTATUS=`cd ${KOKKOS_PATH}; git log -n 1 --format=oneline`
|
||||
echo "Repository Status: " ${GITSTATUS}
|
||||
echo ""
|
||||
echo ""
|
||||
|
||||
#
|
||||
# Machine specific config.
|
||||
#
|
||||
|
@ -149,7 +189,7 @@ fi
|
|||
if [ "$MACHINE" = "sems" ]; then
|
||||
source /projects/sems/modulefiles/utils/sems-modules-init.sh
|
||||
|
||||
BASE_MODULE_LIST="sems-env,kokkos-env,sems-<COMPILER_NAME>/<COMPILER_VERSION>,kokkos-hwloc/1.10.1/base"
|
||||
BASE_MODULE_LIST="sems-env,kokkos-env,kokkos-hwloc/1.10.1/base,sems-<COMPILER_NAME>/<COMPILER_VERSION>"
|
||||
CUDA_MODULE_LIST="sems-env,kokkos-env,kokkos-<COMPILER_NAME>/<COMPILER_VERSION>,sems-gcc/4.8.4,kokkos-hwloc/1.10.1/base"
|
||||
CUDA8_MODULE_LIST="sems-env,kokkos-env,kokkos-<COMPILER_NAME>/<COMPILER_VERSION>,sems-gcc/5.3.0,kokkos-hwloc/1.10.1/base"
|
||||
|
||||
|
@ -178,9 +218,9 @@ if [ "$MACHINE" = "sems" ]; then
|
|||
"clang/3.7.1 $BASE_MODULE_LIST $CLANG_BUILD_LIST clang++ $CLANG_WARNING_FLAGS"
|
||||
"clang/3.8.1 $BASE_MODULE_LIST $CLANG_BUILD_LIST clang++ $CLANG_WARNING_FLAGS"
|
||||
"clang/3.9.0 $BASE_MODULE_LIST $CLANG_BUILD_LIST clang++ $CLANG_WARNING_FLAGS"
|
||||
"cuda/7.0.28 $CUDA_MODULE_LIST $CUDA_BUILD_LIST $KOKKOS_PATH/config/nvcc_wrapper $CUDA_WARNING_FLAGS"
|
||||
"cuda/7.5.18 $CUDA_MODULE_LIST $CUDA_BUILD_LIST $KOKKOS_PATH/config/nvcc_wrapper $CUDA_WARNING_FLAGS"
|
||||
"cuda/8.0.44 $CUDA8_MODULE_LIST $CUDA_BUILD_LIST $KOKKOS_PATH/config/nvcc_wrapper $CUDA_WARNING_FLAGS"
|
||||
"cuda/7.0.28 $CUDA_MODULE_LIST $CUDA_BUILD_LIST $KOKKOS_PATH/bin/nvcc_wrapper $CUDA_WARNING_FLAGS"
|
||||
"cuda/7.5.18 $CUDA_MODULE_LIST $CUDA_BUILD_LIST $KOKKOS_PATH/bin/nvcc_wrapper $CUDA_WARNING_FLAGS"
|
||||
"cuda/8.0.44 $CUDA8_MODULE_LIST $CUDA_BUILD_LIST $KOKKOS_PATH/bin/nvcc_wrapper $CUDA_WARNING_FLAGS"
|
||||
)
|
||||
fi
|
||||
elif [ "$MACHINE" = "white" ]; then
|
||||
|
@ -191,14 +231,14 @@ elif [ "$MACHINE" = "white" ]; then
|
|||
BASE_MODULE_LIST="<COMPILER_NAME>/<COMPILER_VERSION>"
|
||||
IBM_MODULE_LIST="<COMPILER_NAME>/xl/<COMPILER_VERSION>"
|
||||
CUDA_MODULE_LIST="<COMPILER_NAME>/<COMPILER_VERSION>,gcc/5.4.0"
|
||||
CUDA_MODULE_LIST2="<COMPILER_NAME>/<COMPILER_VERSION>,gcc/6.3.0,ibm/xl/13.1.6-BETA"
|
||||
CUDA_MODULE_LIST2="<COMPILER_NAME>/<COMPILER_VERSION>,gcc/6.3.0,ibm/xl/13.1.6"
|
||||
|
||||
# Don't do pthread on white.
|
||||
GCC_BUILD_LIST="OpenMP,Serial,OpenMP_Serial"
|
||||
|
||||
# Format: (compiler module-list build-list exe-name warning-flag)
|
||||
COMPILERS=("gcc/5.4.0 $BASE_MODULE_LIST $IBM_BUILD_LIST g++ $GCC_WARNING_FLAGS"
|
||||
"ibm/13.1.3 $IBM_MODULE_LIST $IBM_BUILD_LIST xlC $IBM_WARNING_FLAGS"
|
||||
"ibm/13.1.6 $IBM_MODULE_LIST $IBM_BUILD_LIST xlC $IBM_WARNING_FLAGS"
|
||||
"cuda/8.0.44 $CUDA_MODULE_LIST $CUDA_IBM_BUILD_LIST ${KOKKOS_PATH}/bin/nvcc_wrapper $CUDA_WARNING_FLAGS"
|
||||
"cuda/9.0.103 $CUDA_MODULE_LIST2 $CUDA_IBM_BUILD_LIST ${KOKKOS_PATH}/bin/nvcc_wrapper $CUDA_WARNING_FLAGS"
|
||||
)
|
||||
|
@ -281,7 +321,7 @@ elif [ "$MACHINE" = "apollo" ]; then
|
|||
CUDA_MODULE_LIST="sems-env,kokkos-env,kokkos-<COMPILER_NAME>/<COMPILER_VERSION>,sems-gcc/4.8.4,kokkos-hwloc/1.10.1/base"
|
||||
CUDA8_MODULE_LIST="sems-env,kokkos-env,kokkos-<COMPILER_NAME>/<COMPILER_VERSION>,sems-gcc/5.3.0,kokkos-hwloc/1.10.1/base"
|
||||
|
||||
CLANG_MODULE_LIST="sems-env,kokkos-env,sems-git,sems-cmake/3.5.2,<COMPILER_NAME>/<COMPILER_VERSION>,cuda/8.0.44"
|
||||
CLANG_MODULE_LIST="sems-env,kokkos-env,sems-git,sems-cmake/3.5.2,<COMPILER_NAME>/<COMPILER_VERSION>,cuda/9.0.69"
|
||||
NVCC_MODULE_LIST="sems-env,kokkos-env,sems-git,sems-cmake/3.5.2,<COMPILER_NAME>/<COMPILER_VERSION>,sems-gcc/5.3.0"
|
||||
|
||||
BUILD_LIST_CUDA_NVCC="Cuda_Serial,Cuda_OpenMP"
|
||||
|
@ -294,13 +334,13 @@ elif [ "$MACHINE" = "apollo" ]; then
|
|||
"gcc/5.1.0 $BASE_MODULE_LIST "Serial" g++ $GCC_WARNING_FLAGS"
|
||||
"intel/16.0.1 $BASE_MODULE_LIST "OpenMP" icpc $INTEL_WARNING_FLAGS"
|
||||
"clang/3.9.0 $BASE_MODULE_LIST "Pthread_Serial" clang++ $CLANG_WARNING_FLAGS"
|
||||
"clang/4.0.0 $CLANG_MODULE_LIST "Cuda_Pthread" clang++ $CUDA_WARNING_FLAGS"
|
||||
"cuda/8.0.44 $CUDA_MODULE_LIST "Cuda_OpenMP" $KOKKOS_PATH/bin/nvcc_wrapper $CUDA_WARNING_FLAGS"
|
||||
"clang/6.0 $CLANG_MODULE_LIST "Cuda_Pthread" clang++ $CUDA_WARNING_FLAGS"
|
||||
"cuda/9.1 $CUDA_MODULE_LIST "Cuda_OpenMP" $KOKKOS_PATH/bin/nvcc_wrapper $CUDA_WARNING_FLAGS"
|
||||
)
|
||||
else
|
||||
# Format: (compiler module-list build-list exe-name warning-flag)
|
||||
COMPILERS=("cuda/8.0.44 $CUDA8_MODULE_LIST $BUILD_LIST_CUDA_NVCC $KOKKOS_PATH/bin/nvcc_wrapper $CUDA_WARNING_FLAGS"
|
||||
"clang/4.0.0 $CLANG_MODULE_LIST $BUILD_LIST_CUDA_CLANG clang++ $CUDA_WARNING_FLAGS"
|
||||
COMPILERS=("cuda/9.1 $CUDA8_MODULE_LIST $BUILD_LIST_CUDA_NVCC $KOKKOS_PATH/bin/nvcc_wrapper $CUDA_WARNING_FLAGS"
|
||||
"clang/6.0 $CLANG_MODULE_LIST $BUILD_LIST_CUDA_CLANG clang++ $CUDA_WARNING_FLAGS"
|
||||
"clang/3.9.0 $CLANG_MODULE_LIST $BUILD_LIST_CLANG clang++ $CLANG_WARNING_FLAGS"
|
||||
"gcc/4.8.4 $BASE_MODULE_LIST $GCC_BUILD_LIST g++ $GCC_WARNING_FLAGS"
|
||||
"gcc/4.9.3 $BASE_MODULE_LIST $GCC_BUILD_LIST g++ $GCC_WARNING_FLAGS"
|
||||
|
@ -311,13 +351,11 @@ elif [ "$MACHINE" = "apollo" ]; then
|
|||
"intel/17.0.1 $BASE_MODULE_LIST $INTEL_BUILD_LIST icpc $INTEL_WARNING_FLAGS"
|
||||
"clang/3.5.2 $BASE_MODULE_LIST $CLANG_BUILD_LIST clang++ $CLANG_WARNING_FLAGS"
|
||||
"clang/3.6.1 $BASE_MODULE_LIST $CLANG_BUILD_LIST clang++ $CLANG_WARNING_FLAGS"
|
||||
"cuda/7.0.28 $CUDA_MODULE_LIST $CUDA_BUILD_LIST $KOKKOS_PATH/bin/nvcc_wrapper $CUDA_WARNING_FLAGS"
|
||||
"cuda/7.5.18 $CUDA_MODULE_LIST $CUDA_BUILD_LIST $KOKKOS_PATH/bin/nvcc_wrapper $CUDA_WARNING_FLAGS"
|
||||
)
|
||||
fi
|
||||
|
||||
if [ -z "$ARCH_FLAG" ]; then
|
||||
ARCH_FLAG="--arch=SNB,Kepler35"
|
||||
ARCH_FLAG="--arch=SNB,Volta70"
|
||||
fi
|
||||
|
||||
NUM_JOBS_TO_RUN_IN_PARALLEL=2
|
||||
|
@ -700,17 +738,19 @@ wait_summarize_and_exit() {
|
|||
echo $passed_test $(cat $PASSED_DIR/$passed_test)
|
||||
done
|
||||
|
||||
echo "#######################################################"
|
||||
echo "FAILED TESTS"
|
||||
echo "#######################################################"
|
||||
|
||||
local failed_test
|
||||
local -i rv=0
|
||||
for failed_test in $(\ls -1 $FAILED_DIR | sort)
|
||||
do
|
||||
echo $failed_test "("$(cat $FAILED_DIR/$failed_test)" failed)"
|
||||
rv=$rv+1
|
||||
done
|
||||
if [ "$(ls -A $FAILED_DIR)" ]; then
|
||||
echo "#######################################################"
|
||||
echo "FAILED TESTS"
|
||||
echo "#######################################################"
|
||||
|
||||
local failed_test
|
||||
for failed_test in $(\ls -1 $FAILED_DIR | sort)
|
||||
do
|
||||
echo $failed_test "("$(cat $FAILED_DIR/$failed_test)" failed)"
|
||||
rv=$rv+1
|
||||
done
|
||||
fi
|
||||
|
||||
exit $rv
|
||||
}
|
||||
|
|
|
@ -35,7 +35,7 @@
|
|||
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
//
|
||||
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
|
||||
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
|
||||
//
|
||||
// ************************************************************************
|
||||
//@HEADER
|
||||
|
|
|
@ -35,7 +35,7 @@
|
|||
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
//
|
||||
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
|
||||
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
|
||||
//
|
||||
// ************************************************************************
|
||||
//@HEADER
|
||||
|
@ -64,8 +64,8 @@ struct InitViewFunctor {
|
|||
|
||||
KOKKOS_INLINE_FUNCTION
|
||||
void operator()(const int i) const {
|
||||
for (unsigned j = 0; j < _inview.dimension(1); ++j) {
|
||||
for (unsigned k = 0; k < _inview.dimension(2); ++k) {
|
||||
for (unsigned j = 0; j < _inview.extent(1); ++j) {
|
||||
for (unsigned k = 0; k < _inview.extent(2); ++k) {
|
||||
_inview(i,j,k) = i/2 -j*j + k/3;
|
||||
}
|
||||
}
|
||||
|
@ -84,8 +84,8 @@ struct InitViewFunctor {
|
|||
|
||||
KOKKOS_INLINE_FUNCTION
|
||||
void operator()(const int i) const {
|
||||
for (unsigned j = 0; j < _inview.dimension(1); ++j) {
|
||||
for (unsigned k = 0; k < _inview.dimension(2); ++k) {
|
||||
for (unsigned j = 0; j < _inview.extent(1); ++j) {
|
||||
for (unsigned k = 0; k < _inview.extent(2); ++k) {
|
||||
_outview(i) += _inview(i,j,k) ;
|
||||
}
|
||||
}
|
||||
|
@ -104,8 +104,8 @@ struct InitStrideViewFunctor {
|
|||
|
||||
KOKKOS_INLINE_FUNCTION
|
||||
void operator()(const int i) const {
|
||||
for (unsigned j = 0; j < _inview.dimension(1); ++j) {
|
||||
for (unsigned k = 0; k < _inview.dimension(2); ++k) {
|
||||
for (unsigned j = 0; j < _inview.extent(1); ++j) {
|
||||
for (unsigned k = 0; k < _inview.extent(2); ++k) {
|
||||
_inview(i,j,k) = i/2 -j*j + k/3;
|
||||
}
|
||||
}
|
||||
|
@ -123,8 +123,8 @@ struct InitViewRank7Functor {
|
|||
|
||||
KOKKOS_INLINE_FUNCTION
|
||||
void operator()(const int i) const {
|
||||
for (unsigned j = 0; j < _inview.dimension(1); ++j) {
|
||||
for (unsigned k = 0; k < _inview.dimension(2); ++k) {
|
||||
for (unsigned j = 0; j < _inview.extent(1); ++j) {
|
||||
for (unsigned k = 0; k < _inview.extent(2); ++k) {
|
||||
_inview(i,j,k,0,0,0,0) = i/2 -j*j + k/3;
|
||||
}
|
||||
}
|
||||
|
@ -143,8 +143,8 @@ struct InitDynRankViewFunctor {
|
|||
|
||||
KOKKOS_INLINE_FUNCTION
|
||||
void operator()(const int i) const {
|
||||
for (unsigned j = 0; j < _inview.dimension(1); ++j) {
|
||||
for (unsigned k = 0; k < _inview.dimension(2); ++k) {
|
||||
for (unsigned j = 0; j < _inview.extent(1); ++j) {
|
||||
for (unsigned k = 0; k < _inview.extent(2); ++k) {
|
||||
_inview(i,j,k) = i/2 -j*j + k/3;
|
||||
}
|
||||
}
|
||||
|
@ -163,8 +163,8 @@ struct InitDynRankViewFunctor {
|
|||
|
||||
KOKKOS_INLINE_FUNCTION
|
||||
void operator()(const int i) const {
|
||||
for (unsigned j = 0; j < _inview.dimension(1); ++j) {
|
||||
for (unsigned k = 0; k < _inview.dimension(2); ++k) {
|
||||
for (unsigned j = 0; j < _inview.extent(1); ++j) {
|
||||
for (unsigned k = 0; k < _inview.extent(2); ++k) {
|
||||
_outview(i) += _inview(i,j,k) ;
|
||||
}
|
||||
}
|
||||
|
|
|
@ -34,7 +34,7 @@
|
|||
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
//
|
||||
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
|
||||
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
|
||||
//
|
||||
// ************************************************************************
|
||||
//@HEADER
|
||||
|
@ -76,7 +76,7 @@ struct generate_ids
|
|||
generate_ids( local_id_view & ids)
|
||||
: local_2_global(ids)
|
||||
{
|
||||
Kokkos::parallel_for(local_2_global.dimension_0(), *this);
|
||||
Kokkos::parallel_for(local_2_global.extent(0), *this);
|
||||
}
|
||||
|
||||
|
||||
|
@ -116,7 +116,7 @@ struct fill_map
|
|||
fill_map( global_id_view gIds, local_id_view lIds)
|
||||
: global_2_local(gIds) , local_2_global(lIds)
|
||||
{
|
||||
Kokkos::parallel_for(local_2_global.dimension_0(), *this);
|
||||
Kokkos::parallel_for(local_2_global.extent(0), *this);
|
||||
}
|
||||
|
||||
KOKKOS_INLINE_FUNCTION
|
||||
|
@ -143,7 +143,7 @@ struct find_test
|
|||
find_test( global_id_view gIds, local_id_view lIds, value_type & num_errors)
|
||||
: global_2_local(gIds) , local_2_global(lIds)
|
||||
{
|
||||
Kokkos::parallel_reduce(local_2_global.dimension_0(), *this, num_errors);
|
||||
Kokkos::parallel_reduce(local_2_global.extent(0), *this, num_errors);
|
||||
}
|
||||
|
||||
KOKKOS_INLINE_FUNCTION
|
||||
|
|
|
@ -35,7 +35,7 @@
|
|||
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
//
|
||||
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
|
||||
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
|
||||
//
|
||||
// ************************************************************************
|
||||
//@HEADER
|
||||
|
|
|
@ -35,7 +35,7 @@
|
|||
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
//
|
||||
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
|
||||
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
|
||||
//
|
||||
// ************************************************************************
|
||||
//@HEADER
|
||||
|
|
|
@ -35,7 +35,7 @@
|
|||
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
//
|
||||
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
|
||||
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
|
||||
//
|
||||
// ************************************************************************
|
||||
//@HEADER
|
||||
|
|
|
@ -35,7 +35,7 @@
|
|||
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
//
|
||||
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
|
||||
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
|
||||
//
|
||||
// ************************************************************************
|
||||
//@HEADER
|
||||
|
|
|
@ -35,7 +35,7 @@
|
|||
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
//
|
||||
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
|
||||
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
|
||||
//
|
||||
// ************************************************************************
|
||||
//@HEADER
|
||||
|
|
|
@ -34,7 +34,7 @@
|
|||
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
//
|
||||
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
|
||||
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
|
||||
//
|
||||
// ************************************************************************
|
||||
//@HEADER
|
||||
|
|
|
@ -35,7 +35,7 @@
|
|||
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
//
|
||||
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
|
||||
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
|
||||
//
|
||||
// ************************************************************************
|
||||
//@HEADER
|
||||
|
@ -147,7 +147,7 @@ public:
|
|||
if (m_last_block_mask) {
|
||||
//clear the unused bits in the last block
|
||||
typedef Kokkos::Impl::DeepCopy< typename execution_space::memory_space, Kokkos::HostSpace > raw_deep_copy;
|
||||
raw_deep_copy( m_blocks.ptr_on_device() + (m_blocks.dimension_0() -1u), &m_last_block_mask, sizeof(unsigned));
|
||||
raw_deep_copy( m_blocks.data() + (m_blocks.extent(0) -1u), &m_last_block_mask, sizeof(unsigned));
|
||||
}
|
||||
}
|
||||
|
||||
|
@ -212,7 +212,7 @@ public:
|
|||
KOKKOS_FORCEINLINE_FUNCTION
|
||||
unsigned max_hint() const
|
||||
{
|
||||
return m_blocks.dimension_0();
|
||||
return m_blocks.extent(0);
|
||||
}
|
||||
|
||||
/// find a bit set to 1 near the hint
|
||||
|
@ -221,10 +221,10 @@ public:
|
|||
KOKKOS_INLINE_FUNCTION
|
||||
Kokkos::pair<bool, unsigned> find_any_set_near( unsigned hint , unsigned scan_direction = BIT_SCAN_FORWARD_MOVE_HINT_FORWARD ) const
|
||||
{
|
||||
const unsigned block_idx = (hint >> block_shift) < m_blocks.dimension_0() ? (hint >> block_shift) : 0;
|
||||
const unsigned block_idx = (hint >> block_shift) < m_blocks.extent(0) ? (hint >> block_shift) : 0;
|
||||
const unsigned offset = hint & block_mask;
|
||||
unsigned block = volatile_load(&m_blocks[ block_idx ]);
|
||||
block = !m_last_block_mask || (block_idx < (m_blocks.dimension_0()-1)) ? block : block & m_last_block_mask ;
|
||||
block = !m_last_block_mask || (block_idx < (m_blocks.extent(0)-1)) ? block : block & m_last_block_mask ;
|
||||
|
||||
return find_any_helper(block_idx, offset, block, scan_direction);
|
||||
}
|
||||
|
@ -238,7 +238,7 @@ public:
|
|||
const unsigned block_idx = hint >> block_shift;
|
||||
const unsigned offset = hint & block_mask;
|
||||
unsigned block = volatile_load(&m_blocks[ block_idx ]);
|
||||
block = !m_last_block_mask || (block_idx < (m_blocks.dimension_0()-1) ) ? ~block : ~block & m_last_block_mask ;
|
||||
block = !m_last_block_mask || (block_idx < (m_blocks.extent(0)-1) ) ? ~block : ~block & m_last_block_mask ;
|
||||
|
||||
return find_any_helper(block_idx, offset, block, scan_direction);
|
||||
}
|
||||
|
@@ -281,8 +281,8 @@ private:
|
|||
unsigned update_hint( long long block_idx, unsigned offset, unsigned scan_direction ) const
|
||||
{
|
||||
block_idx += scan_direction & MOVE_HINT_BACKWARD ? -1 : 1;
|
||||
block_idx = block_idx >= 0 ? block_idx : m_blocks.dimension_0() - 1;
|
||||
block_idx = block_idx < static_cast<long long>(m_blocks.dimension_0()) ? block_idx : 0;
|
||||
block_idx = block_idx >= 0 ? block_idx : m_blocks.extent(0) - 1;
|
||||
block_idx = block_idx < static_cast<long long>(m_blocks.extent(0)) ? block_idx : 0;
|
||||
|
||||
return static_cast<unsigned>(block_idx)*block_size + offset;
|
||||
}
|
||||
|
@@ -407,7 +407,7 @@ void deep_copy( Bitset<DstDevice> & dst, Bitset<SrcDevice> const& src)
|
|||
}
|
||||
|
||||
typedef Kokkos::Impl::DeepCopy< typename DstDevice::memory_space, typename SrcDevice::memory_space > raw_deep_copy;
|
||||
raw_deep_copy(dst.m_blocks.ptr_on_device(), src.m_blocks.ptr_on_device(), sizeof(unsigned)*src.m_blocks.dimension_0());
|
||||
raw_deep_copy(dst.m_blocks.data(), src.m_blocks.data(), sizeof(unsigned)*src.m_blocks.extent(0));
|
||||
}
|
||||
|
||||
template <typename DstDevice, typename SrcDevice>
|
||||
|
@@ -418,7 +418,7 @@ void deep_copy( Bitset<DstDevice> & dst, ConstBitset<SrcDevice> const& src)
|
|||
}
|
||||
|
||||
typedef Kokkos::Impl::DeepCopy< typename DstDevice::memory_space, typename SrcDevice::memory_space > raw_deep_copy;
|
||||
raw_deep_copy(dst.m_blocks.ptr_on_device(), src.m_blocks.ptr_on_device(), sizeof(unsigned)*src.m_blocks.dimension_0());
|
||||
raw_deep_copy(dst.m_blocks.data(), src.m_blocks.data(), sizeof(unsigned)*src.m_blocks.extent(0));
|
||||
}
|
||||
|
||||
template <typename DstDevice, typename SrcDevice>
|
||||
|
@@ -429,7 +429,7 @@ void deep_copy( ConstBitset<DstDevice> & dst, ConstBitset<SrcDevice> const& src)
|
|||
}
|
||||
|
||||
typedef Kokkos::Impl::DeepCopy< typename DstDevice::memory_space, typename SrcDevice::memory_space > raw_deep_copy;
|
||||
raw_deep_copy(dst.m_blocks.ptr_on_device(), src.m_blocks.ptr_on_device(), sizeof(unsigned)*src.m_blocks.dimension_0());
|
||||
raw_deep_copy(dst.m_blocks.data(), src.m_blocks.data(), sizeof(unsigned)*src.m_blocks.extent(0));
|
||||
}
|
||||
|
||||
} // namespace Kokkos
|
||||
|
|
|
@@ -35,7 +35,7 @@
|
|||
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
//
|
||||
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
|
||||
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
|
||||
//
|
||||
// ************************************************************************
|
||||
//@HEADER
|
||||
|
@@ -262,14 +262,14 @@ public:
|
|||
modified_host (View<unsigned int,LayoutLeft,typename t_host::execution_space> ("DualView::modified_host"))
|
||||
{
|
||||
if ( int(d_view.rank) != int(h_view.rank) ||
|
||||
d_view.dimension_0() != h_view.dimension_0() ||
|
||||
d_view.dimension_1() != h_view.dimension_1() ||
|
||||
d_view.dimension_2() != h_view.dimension_2() ||
|
||||
d_view.dimension_3() != h_view.dimension_3() ||
|
||||
d_view.dimension_4() != h_view.dimension_4() ||
|
||||
d_view.dimension_5() != h_view.dimension_5() ||
|
||||
d_view.dimension_6() != h_view.dimension_6() ||
|
||||
d_view.dimension_7() != h_view.dimension_7() ||
|
||||
d_view.extent(0) != h_view.extent(0) ||
|
||||
d_view.extent(1) != h_view.extent(1) ||
|
||||
d_view.extent(2) != h_view.extent(2) ||
|
||||
d_view.extent(3) != h_view.extent(3) ||
|
||||
d_view.extent(4) != h_view.extent(4) ||
|
||||
d_view.extent(5) != h_view.extent(5) ||
|
||||
d_view.extent(6) != h_view.extent(6) ||
|
||||
d_view.extent(7) != h_view.extent(7) ||
|
||||
d_view.stride_0() != h_view.stride_0() ||
|
||||
d_view.stride_1() != h_view.stride_1() ||
|
||||
d_view.stride_2() != h_view.stride_2() ||
|
||||
|
@@ -503,6 +503,18 @@ public:
|
|||
/* Realloc on Device */
|
||||
|
||||
::Kokkos::realloc(d_view,n0,n1,n2,n3,n4,n5,n6,n7);
|
||||
|
||||
const bool sizeMismatch = ( h_view.extent(0) != n0 ) ||
|
||||
( h_view.extent(1) != n1 ) ||
|
||||
( h_view.extent(2) != n2 ) ||
|
||||
( h_view.extent(3) != n3 ) ||
|
||||
( h_view.extent(4) != n4 ) ||
|
||||
( h_view.extent(5) != n5 ) ||
|
||||
( h_view.extent(6) != n6 ) ||
|
||||
( h_view.extent(7) != n7 );
|
||||
if ( sizeMismatch )
|
||||
::Kokkos::resize(h_view,n0,n1,n2,n3,n4,n5,n6,n7);
|
||||
|
||||
t_host temp_view = create_mirror_view( d_view );
|
||||
|
||||
/* Remap on Host */
|
||||
|
@@ -510,6 +522,8 @@ public:
|
|||
|
||||
h_view = temp_view;
|
||||
|
||||
d_view = create_mirror_view( typename t_dev::execution_space(), h_view );
|
||||
|
||||
/* Mark Host copy as modified */
|
||||
modified_host() = modified_host()+1;
|
||||
}
|
||||
|
@@ -530,22 +544,34 @@ public:
|
|||
d_view.stride(stride_);
|
||||
}
|
||||
|
||||
template< typename iType >
|
||||
KOKKOS_INLINE_FUNCTION constexpr
|
||||
typename std::enable_if< std::is_integral<iType>::value , size_t >::type
|
||||
extent( const iType & r ) const
|
||||
{ return d_view.extent(r); }
|
||||
|
||||
template< typename iType >
|
||||
KOKKOS_INLINE_FUNCTION constexpr
|
||||
typename std::enable_if< std::is_integral<iType>::value , int >::type
|
||||
extent_int( const iType & r ) const
|
||||
{ return static_cast<int>(d_view.extent(r)); }
|
||||
|
||||
/* \brief return size of dimension 0 */
|
||||
size_t dimension_0() const {return d_view.dimension_0();}
|
||||
size_t dimension_0() const {return d_view.extent(0);}
|
||||
/* \brief return size of dimension 1 */
|
||||
size_t dimension_1() const {return d_view.dimension_1();}
|
||||
size_t dimension_1() const {return d_view.extent(1);}
|
||||
/* \brief return size of dimension 2 */
|
||||
size_t dimension_2() const {return d_view.dimension_2();}
|
||||
size_t dimension_2() const {return d_view.extent(2);}
|
||||
/* \brief return size of dimension 3 */
|
||||
size_t dimension_3() const {return d_view.dimension_3();}
|
||||
size_t dimension_3() const {return d_view.extent(3);}
|
||||
/* \brief return size of dimension 4 */
|
||||
size_t dimension_4() const {return d_view.dimension_4();}
|
||||
size_t dimension_4() const {return d_view.extent(4);}
|
||||
/* \brief return size of dimension 5 */
|
||||
size_t dimension_5() const {return d_view.dimension_5();}
|
||||
size_t dimension_5() const {return d_view.extent(5);}
|
||||
/* \brief return size of dimension 6 */
|
||||
size_t dimension_6() const {return d_view.dimension_6();}
|
||||
size_t dimension_6() const {return d_view.extent(6);}
|
||||
/* \brief return size of dimension 7 */
|
||||
size_t dimension_7() const {return d_view.dimension_7();}
|
||||
size_t dimension_7() const {return d_view.extent(7);}
|
||||
|
||||
//@}
|
||||
};
|
||||
|
|
|
@@ -35,16 +35,16 @@
|
|||
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
//
|
||||
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
|
||||
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
|
||||
//
|
||||
// ************************************************************************
|
||||
//@HEADER
|
||||
*/
|
||||
|
||||
/// \file Kokkos_DynRankView.hpp
|
||||
/// \brief Declaration and definition of Kokkos::Experimental::DynRankView.
|
||||
/// \brief Declaration and definition of Kokkos::DynRankView.
|
||||
///
|
||||
/// This header file declares and defines Kokkos::Experimental::DynRankView and its
|
||||
/// This header file declares and defines Kokkos::DynRankView and its
|
||||
/// related nonmember functions.
|
||||
|
||||
#ifndef KOKKOS_DYNRANKVIEW_HPP
|
||||
|
@@ -55,7 +55,6 @@
|
|||
#include <type_traits>
|
||||
|
||||
namespace Kokkos {
|
||||
namespace Experimental {
|
||||
|
||||
template< typename DataType , class ... Properties >
|
||||
class DynRankView; //forward declare
|
||||
|
@@ -156,7 +155,7 @@ struct DynRankDimTraits {
|
|||
// Extra overload to match that for specialize types
|
||||
template <typename Traits, typename ... P>
|
||||
KOKKOS_INLINE_FUNCTION
|
||||
static typename std::enable_if< (std::is_same<typename Traits::array_layout , Kokkos::LayoutRight>::value || std::is_same<typename Traits::array_layout , Kokkos::LayoutLeft>::value || std::is_same<typename Traits::array_layout , Kokkos::LayoutStride>::value) , typename Traits::array_layout >::type createLayout( const ViewCtorProp<P...>& prop, const typename Traits::array_layout& layout )
|
||||
static typename std::enable_if< (std::is_same<typename Traits::array_layout , Kokkos::LayoutRight>::value || std::is_same<typename Traits::array_layout , Kokkos::LayoutLeft>::value || std::is_same<typename Traits::array_layout , Kokkos::LayoutStride>::value) , typename Traits::array_layout >::type createLayout( const Kokkos::Impl::ViewCtorProp<P...>& prop, const typename Traits::array_layout& layout )
|
||||
{
|
||||
return createLayout( layout );
|
||||
}
|
||||
|
@@ -318,7 +317,6 @@ void dyn_rank_view_verify_operator_bounds
|
|||
struct ViewToDynRankViewTag {};
|
||||
|
||||
} // namespace Impl
|
||||
} // namespace Experimental
|
||||
|
||||
namespace Impl {
|
||||
|
||||
|
@@ -348,7 +346,7 @@ class ViewMapping< DstTraits , SrcTraits ,
|
|||
)
|
||||
)
|
||||
)
|
||||
) , Kokkos::Experimental::Impl::ViewToDynRankViewTag >::type >
|
||||
) , Kokkos::Impl::ViewToDynRankViewTag >::type >
|
||||
{
|
||||
private:
|
||||
|
||||
|
@@ -375,7 +373,7 @@ public:
|
|||
|
||||
template < typename DT , typename ... DP , typename ST , typename ... SP >
|
||||
KOKKOS_INLINE_FUNCTION
|
||||
static void assign( Kokkos::Experimental::DynRankView< DT , DP...> & dst , const Kokkos::View< ST , SP... > & src )
|
||||
static void assign( Kokkos::DynRankView< DT , DP...> & dst , const Kokkos::View< ST , SP... > & src )
|
||||
{
|
||||
static_assert( is_assignable_value_type
|
||||
, "View assignment must have same value type or const = non-const" );
|
||||
|
@@ -395,8 +393,6 @@ public:
|
|||
|
||||
} //end Impl
|
||||
|
||||
namespace Experimental {
|
||||
|
||||
/* \class DynRankView
|
||||
* \brief Container that creates a Kokkos view with rank determined at runtime.
|
||||
* Essentially this is a rank 7 view
|
||||
|
@@ -415,7 +411,7 @@ namespace Experimental {
|
|||
template< class > struct is_dyn_rank_view : public std::false_type {};
|
||||
|
||||
template< class D, class ... P >
|
||||
struct is_dyn_rank_view< Kokkos::Experimental::DynRankView<D,P...> > : public std::true_type {};
|
||||
struct is_dyn_rank_view< Kokkos::DynRankView<D,P...> > : public std::true_type {};
|
||||
|
||||
|
||||
template< typename DataType , class ... Properties >
|
||||
|
@@ -425,7 +421,7 @@ class DynRankView : public ViewTraits< DataType , Properties ... >
|
|||
|
||||
private:
|
||||
template < class , class ... > friend class DynRankView ;
|
||||
template < class , class ... > friend class Impl::ViewMapping ;
|
||||
template < class , class ... > friend class Kokkos::Impl::ViewMapping ;
|
||||
|
||||
public:
|
||||
typedef ViewTraits< DataType , Properties ... > drvtraits ;
|
||||
|
@@ -437,7 +433,7 @@ public:
|
|||
|
||||
private:
|
||||
typedef Kokkos::Impl::ViewMapping< traits , void > map_type ;
|
||||
typedef Kokkos::Experimental::Impl::SharedAllocationTracker track_type ;
|
||||
typedef Kokkos::Impl::SharedAllocationTracker track_type ;
|
||||
|
||||
track_type m_track ;
|
||||
map_type m_map ;
|
||||
|
@@ -601,7 +597,7 @@ private:
|
|||
// rank of the calling operator - included as first argument in ARG
|
||||
#define KOKKOS_IMPL_VIEW_OPERATOR_VERIFY( ARG ) \
|
||||
DynRankView::template verify_space< Kokkos::Impl::ActiveExecutionMemorySpace >::check(); \
|
||||
Kokkos::Experimental::Impl::dyn_rank_view_verify_operator_bounds< typename traits::memory_space > ARG ;
|
||||
Kokkos::Impl::dyn_rank_view_verify_operator_bounds< typename traits::memory_space > ARG ;
|
||||
|
||||
#else
|
||||
|
||||
|
@@ -778,6 +774,140 @@ public:
|
|||
return m_map.reference(i0,i1,i2,i3,i4,i5,i6);
|
||||
}
|
||||
|
||||
// Rank 0
|
||||
KOKKOS_INLINE_FUNCTION
|
||||
reference_type access() const
|
||||
{
|
||||
KOKKOS_IMPL_VIEW_OPERATOR_VERIFY( (0 , this->rank(), m_track, m_map) )
|
||||
return implementation_map().reference();
|
||||
//return m_map.reference(0,0,0,0,0,0,0);
|
||||
}
|
||||
|
||||
// Rank 1
|
||||
// Rank 1 parenthesis
|
||||
template< typename iType >
|
||||
KOKKOS_INLINE_FUNCTION
|
||||
typename std::enable_if< (std::is_same<typename traits::specialize , void>::value && std::is_integral<iType>::value), reference_type>::type
|
||||
access(const iType & i0 ) const
|
||||
{
|
||||
KOKKOS_IMPL_VIEW_OPERATOR_VERIFY( (1 , this->rank(), m_track, m_map, i0) )
|
||||
return m_map.reference(i0);
|
||||
}
|
||||
|
||||
template< typename iType >
|
||||
KOKKOS_INLINE_FUNCTION
|
||||
typename std::enable_if< !(std::is_same<typename traits::specialize , void>::value && std::is_integral<iType>::value), reference_type>::type
|
||||
access(const iType & i0 ) const
|
||||
{
|
||||
KOKKOS_IMPL_VIEW_OPERATOR_VERIFY( (1 , this->rank(), m_track, m_map, i0) )
|
||||
return m_map.reference(i0,0,0,0,0,0,0);
|
||||
}
|
||||
|
||||
// Rank 2
|
||||
template< typename iType0 , typename iType1 >
|
||||
KOKKOS_INLINE_FUNCTION
|
||||
typename std::enable_if< (std::is_same<typename traits::specialize , void>::value && std::is_integral<iType0>::value && std::is_integral<iType1>::value), reference_type>::type
|
||||
access(const iType0 & i0 , const iType1 & i1 ) const
|
||||
{
|
||||
KOKKOS_IMPL_VIEW_OPERATOR_VERIFY( (2 , this->rank(), m_track, m_map, i0, i1) )
|
||||
return m_map.reference(i0,i1);
|
||||
}
|
||||
|
||||
template< typename iType0 , typename iType1 >
|
||||
KOKKOS_INLINE_FUNCTION
|
||||
typename std::enable_if< !(std::is_same<typename drvtraits::specialize , void>::value && std::is_integral<iType0>::value), reference_type>::type
|
||||
access(const iType0 & i0 , const iType1 & i1 ) const
|
||||
{
|
||||
KOKKOS_IMPL_VIEW_OPERATOR_VERIFY( (2 , this->rank(), m_track, m_map, i0, i1) )
|
||||
return m_map.reference(i0,i1,0,0,0,0,0);
|
||||
}
|
||||
|
||||
// Rank 3
|
||||
template< typename iType0 , typename iType1 , typename iType2 >
|
||||
KOKKOS_INLINE_FUNCTION
|
||||
typename std::enable_if< (std::is_same<typename traits::specialize , void>::value && std::is_integral<iType0>::value && std::is_integral<iType1>::value && std::is_integral<iType2>::value), reference_type>::type
|
||||
access(const iType0 & i0 , const iType1 & i1 , const iType2 & i2 ) const
|
||||
{
|
||||
KOKKOS_IMPL_VIEW_OPERATOR_VERIFY( (3 , this->rank(), m_track, m_map, i0, i1, i2) )
|
||||
return m_map.reference(i0,i1,i2);
|
||||
}
|
||||
|
||||
template< typename iType0 , typename iType1 , typename iType2 >
|
||||
KOKKOS_INLINE_FUNCTION
|
||||
typename std::enable_if< !(std::is_same<typename drvtraits::specialize , void>::value && std::is_integral<iType0>::value), reference_type>::type
|
||||
access(const iType0 & i0 , const iType1 & i1 , const iType2 & i2 ) const
|
||||
{
|
||||
KOKKOS_IMPL_VIEW_OPERATOR_VERIFY( (3 , this->rank(), m_track, m_map, i0, i1, i2) )
|
||||
return m_map.reference(i0,i1,i2,0,0,0,0);
|
||||
}
|
||||
|
||||
// Rank 4
|
||||
template< typename iType0 , typename iType1 , typename iType2 , typename iType3 >
|
||||
KOKKOS_INLINE_FUNCTION
|
||||
typename std::enable_if< (std::is_same<typename traits::specialize , void>::value && std::is_integral<iType0>::value && std::is_integral<iType1>::value && std::is_integral<iType2>::value && std::is_integral<iType3>::value), reference_type>::type
|
||||
access(const iType0 & i0 , const iType1 & i1 , const iType2 & i2 , const iType3 & i3 ) const
|
||||
{
|
||||
KOKKOS_IMPL_VIEW_OPERATOR_VERIFY( (4 , this->rank(), m_track, m_map, i0, i1, i2, i3) )
|
||||
return m_map.reference(i0,i1,i2,i3);
|
||||
}
|
||||
|
||||
template< typename iType0 , typename iType1 , typename iType2 , typename iType3 >
|
||||
KOKKOS_INLINE_FUNCTION
|
||||
typename std::enable_if< !(std::is_same<typename drvtraits::specialize , void>::value && std::is_integral<iType0>::value), reference_type>::type
|
||||
access(const iType0 & i0 , const iType1 & i1 , const iType2 & i2 , const iType3 & i3 ) const
|
||||
{
|
||||
KOKKOS_IMPL_VIEW_OPERATOR_VERIFY( (4 , this->rank(), m_track, m_map, i0, i1, i2, i3) )
|
||||
return m_map.reference(i0,i1,i2,i3,0,0,0);
|
||||
}
|
||||
|
||||
// Rank 5
|
||||
template< typename iType0 , typename iType1 , typename iType2 , typename iType3, typename iType4 >
|
||||
KOKKOS_INLINE_FUNCTION
|
||||
typename std::enable_if< (std::is_same<typename traits::specialize , void>::value && std::is_integral<iType0>::value && std::is_integral<iType1>::value && std::is_integral<iType2>::value && std::is_integral<iType3>::value && std::is_integral<iType4>::value), reference_type>::type
|
||||
access(const iType0 & i0 , const iType1 & i1 , const iType2 & i2 , const iType3 & i3 , const iType4 & i4 ) const
|
||||
{
|
||||
KOKKOS_IMPL_VIEW_OPERATOR_VERIFY( (5 , this->rank(), m_track, m_map, i0, i1, i2, i3, i4) )
|
||||
return m_map.reference(i0,i1,i2,i3,i4);
|
||||
}
|
||||
|
||||
template< typename iType0 , typename iType1 , typename iType2 , typename iType3, typename iType4 >
|
||||
KOKKOS_INLINE_FUNCTION
|
||||
typename std::enable_if< !(std::is_same<typename drvtraits::specialize , void>::value && std::is_integral<iType0>::value), reference_type>::type
|
||||
access(const iType0 & i0 , const iType1 & i1 , const iType2 & i2 , const iType3 & i3 , const iType4 & i4 ) const
|
||||
{
|
||||
KOKKOS_IMPL_VIEW_OPERATOR_VERIFY( (5 , this->rank(), m_track, m_map, i0, i1, i2, i3, i4) )
|
||||
return m_map.reference(i0,i1,i2,i3,i4,0,0);
|
||||
}
|
||||
|
||||
// Rank 6
|
||||
template< typename iType0 , typename iType1 , typename iType2 , typename iType3, typename iType4 , typename iType5 >
|
||||
KOKKOS_INLINE_FUNCTION
|
||||
typename std::enable_if< (std::is_same<typename traits::specialize , void>::value && std::is_integral<iType0>::value && std::is_integral<iType1>::value && std::is_integral<iType2>::value && std::is_integral<iType3>::value && std::is_integral<iType4>::value && std::is_integral<iType5>::value), reference_type>::type
|
||||
access(const iType0 & i0 , const iType1 & i1 , const iType2 & i2 , const iType3 & i3 , const iType4 & i4 , const iType5 & i5 ) const
|
||||
{
|
||||
KOKKOS_IMPL_VIEW_OPERATOR_VERIFY( (6 , this->rank(), m_track, m_map, i0, i1, i2, i3, i4, i5) )
|
||||
return m_map.reference(i0,i1,i2,i3,i4,i5);
|
||||
}
|
||||
|
||||
template< typename iType0 , typename iType1 , typename iType2 , typename iType3, typename iType4 , typename iType5 >
|
||||
KOKKOS_INLINE_FUNCTION
|
||||
typename std::enable_if< !(std::is_same<typename drvtraits::specialize , void>::value && std::is_integral<iType0>::value), reference_type>::type
|
||||
access(const iType0 & i0 , const iType1 & i1 , const iType2 & i2 , const iType3 & i3 , const iType4 & i4 , const iType5 & i5 ) const
|
||||
{
|
||||
KOKKOS_IMPL_VIEW_OPERATOR_VERIFY( (6 , this->rank(), m_track, m_map, i0, i1, i2, i3, i4, i5) )
|
||||
return m_map.reference(i0,i1,i2,i3,i4,i5,0);
|
||||
}
|
||||
|
||||
// Rank 7
|
||||
template< typename iType0 , typename iType1 , typename iType2 , typename iType3, typename iType4 , typename iType5 , typename iType6 >
|
||||
KOKKOS_INLINE_FUNCTION
|
||||
typename std::enable_if< (std::is_integral<iType0>::value && std::is_integral<iType1>::value && std::is_integral<iType2>::value && std::is_integral<iType3>::value && std::is_integral<iType4>::value && std::is_integral<iType5>::value && std::is_integral<iType6>::value), reference_type>::type
|
||||
access(const iType0 & i0 , const iType1 & i1 , const iType2 & i2 , const iType3 & i3 , const iType4 & i4 , const iType5 & i5 , const iType6 & i6 ) const
|
||||
{
|
||||
KOKKOS_IMPL_VIEW_OPERATOR_VERIFY( (7 , this->rank(), m_track, m_map, i0, i1, i2, i3, i4, i5, i6) )
|
||||
return m_map.reference(i0,i1,i2,i3,i4,i5,i6);
|
||||
}
|
||||
|
||||
#undef KOKKOS_IMPL_VIEW_OPERATOR_VERIFY
|
||||
|
||||
//----------------------------------------
|
||||
|
@@ -830,7 +960,6 @@ public:
|
|||
return *this;
|
||||
}
|
||||
|
||||
// Experimental
|
||||
// Copy/Assign View to DynRankView
|
||||
template< class RT , class ... RP >
|
||||
KOKKOS_INLINE_FUNCTION
|
||||
|
@@ -840,7 +969,7 @@ public:
|
|||
, m_rank( rhs.Rank )
|
||||
{
|
||||
typedef typename View<RT,RP...>::traits SrcTraits ;
|
||||
typedef Kokkos::Impl::ViewMapping< traits , SrcTraits , Kokkos::Experimental::Impl::ViewToDynRankViewTag > Mapping ;
|
||||
typedef Kokkos::Impl::ViewMapping< traits , SrcTraits , Kokkos::Impl::ViewToDynRankViewTag > Mapping ;
|
||||
static_assert( Mapping::is_assignable , "Incompatible DynRankView copy construction" );
|
||||
Mapping::assign( *this , rhs );
|
||||
}
|
||||
|
@@ -850,7 +979,7 @@ public:
|
|||
DynRankView & operator = ( const View<RT,RP...> & rhs )
|
||||
{
|
||||
typedef typename View<RT,RP...>::traits SrcTraits ;
|
||||
typedef Kokkos::Impl::ViewMapping< traits , SrcTraits , Kokkos::Experimental::Impl::ViewToDynRankViewTag > Mapping ;
|
||||
typedef Kokkos::Impl::ViewMapping< traits , SrcTraits , Kokkos::Impl::ViewToDynRankViewTag > Mapping ;
|
||||
static_assert( Mapping::is_assignable , "Incompatible View to DynRankView copy assignment" );
|
||||
Mapping::assign( *this , rhs );
|
||||
return *this ;
|
||||
|
@@ -872,8 +1001,8 @@ public:
|
|||
// unused arg_layout dimensions must be set to ~size_t(0) so that rank deduction can properly take place
|
||||
template< class ... P >
|
||||
explicit inline
|
||||
DynRankView( const Impl::ViewCtorProp< P ... > & arg_prop
|
||||
, typename std::enable_if< ! Impl::ViewCtorProp< P... >::has_pointer
|
||||
DynRankView( const Kokkos::Impl::ViewCtorProp< P ... > & arg_prop
|
||||
, typename std::enable_if< ! Kokkos::Impl::ViewCtorProp< P... >::has_pointer
|
||||
, typename traits::array_layout
|
||||
>::type const & arg_layout
|
||||
)
|
||||
|
@@ -882,11 +1011,11 @@ public:
|
|||
, m_rank( Impl::DynRankDimTraits<typename traits::specialize>::template computeRank< typename traits::array_layout, P...>(arg_prop, arg_layout) )
|
||||
{
|
||||
// Append layout and spaces if not input
|
||||
typedef Impl::ViewCtorProp< P ... > alloc_prop_input ;
|
||||
typedef Kokkos::Impl::ViewCtorProp< P ... > alloc_prop_input ;
|
||||
|
||||
// use 'std::integral_constant<unsigned,I>' for non-types
|
||||
// to avoid duplicate class error.
|
||||
typedef Impl::ViewCtorProp
|
||||
typedef Kokkos::Impl::ViewCtorProp
|
||||
< P ...
|
||||
, typename std::conditional
|
||||
< alloc_prop_input::has_label
|
||||
|
@@ -931,7 +1060,7 @@ public:
|
|||
#endif
|
||||
//------------------------------------------------------------
|
||||
|
||||
Kokkos::Experimental::Impl::SharedAllocationRecord<> *
|
||||
Kokkos::Impl::SharedAllocationRecord<> *
|
||||
record = m_map.allocate_shared( prop , Impl::DynRankDimTraits<typename traits::specialize>::template createLayout<traits, P...>(arg_prop, arg_layout) );
|
||||
|
||||
//------------------------------------------------------------
|
||||
|
@@ -950,8 +1079,8 @@ public:
|
|||
// Wrappers
|
||||
template< class ... P >
|
||||
explicit KOKKOS_INLINE_FUNCTION
|
||||
DynRankView( const Impl::ViewCtorProp< P ... > & arg_prop
|
||||
, typename std::enable_if< Impl::ViewCtorProp< P... >::has_pointer
|
||||
DynRankView( const Kokkos::Impl::ViewCtorProp< P ... > & arg_prop
|
||||
, typename std::enable_if< Kokkos::Impl::ViewCtorProp< P... >::has_pointer
|
||||
, typename traits::array_layout
|
||||
>::type const & arg_layout
|
||||
)
|
||||
|
@@ -972,8 +1101,8 @@ public:
|
|||
// Simple dimension-only layout
|
||||
template< class ... P >
|
||||
explicit inline
|
||||
DynRankView( const Impl::ViewCtorProp< P ... > & arg_prop
|
||||
, typename std::enable_if< ! Impl::ViewCtorProp< P... >::has_pointer
|
||||
DynRankView( const Kokkos::Impl::ViewCtorProp< P ... > & arg_prop
|
||||
, typename std::enable_if< ! Kokkos::Impl::ViewCtorProp< P... >::has_pointer
|
||||
, size_t
|
||||
>::type const arg_N0 = ~size_t(0)
|
||||
, const size_t arg_N1 = ~size_t(0)
|
||||
|
@@ -992,8 +1121,8 @@ public:
|
|||
|
||||
template< class ... P >
|
||||
explicit KOKKOS_INLINE_FUNCTION
|
||||
DynRankView( const Impl::ViewCtorProp< P ... > & arg_prop
|
||||
, typename std::enable_if< Impl::ViewCtorProp< P... >::has_pointer
|
||||
DynRankView( const Kokkos::Impl::ViewCtorProp< P ... > & arg_prop
|
||||
, typename std::enable_if< Kokkos::Impl::ViewCtorProp< P... >::has_pointer
|
||||
, size_t
|
||||
>::type const arg_N0 = ~size_t(0)
|
||||
, const size_t arg_N1 = ~size_t(0)
|
||||
|
@@ -1015,10 +1144,10 @@ public:
|
|||
explicit inline
|
||||
DynRankView( const Label & arg_label
|
||||
, typename std::enable_if<
|
||||
Kokkos::Experimental::Impl::is_view_label<Label>::value ,
|
||||
Kokkos::Impl::is_view_label<Label>::value ,
|
||||
typename traits::array_layout >::type const & arg_layout
|
||||
)
|
||||
: DynRankView( Impl::ViewCtorProp< std::string >( arg_label ) , arg_layout )
|
||||
: DynRankView( Kokkos::Impl::ViewCtorProp< std::string >( arg_label ) , arg_layout )
|
||||
{}
|
||||
|
||||
// Allocate label and layout, must disambiguate from subview constructor
|
||||
|
@@ -1026,7 +1155,7 @@ public:
|
|||
explicit inline
|
||||
DynRankView( const Label & arg_label
|
||||
, typename std::enable_if<
|
||||
Kokkos::Experimental::Impl::is_view_label<Label>::value ,
|
||||
Kokkos::Impl::is_view_label<Label>::value ,
|
||||
const size_t >::type arg_N0 = ~size_t(0)
|
||||
, const size_t arg_N1 = ~size_t(0)
|
||||
, const size_t arg_N2 = ~size_t(0)
|
||||
|
@@ -1036,7 +1165,7 @@ public:
|
|||
, const size_t arg_N6 = ~size_t(0)
|
||||
, const size_t arg_N7 = ~size_t(0)
|
||||
)
|
||||
: DynRankView( Impl::ViewCtorProp< std::string >( arg_label )
|
||||
: DynRankView( Kokkos::Impl::ViewCtorProp< std::string >( arg_label )
|
||||
, typename traits::array_layout
|
||||
( arg_N0 , arg_N1 , arg_N2 , arg_N3 , arg_N4 , arg_N5 , arg_N6 , arg_N7 )
|
||||
)
|
||||
|
@@ -1048,7 +1177,8 @@ public:
|
|||
DynRankView( const ViewAllocateWithoutInitializing & arg_prop
|
||||
, const typename traits::array_layout & arg_layout
|
||||
)
|
||||
: DynRankView( Impl::ViewCtorProp< std::string , Kokkos::Experimental::Impl::WithoutInitializing_t >( arg_prop.label , Kokkos::Experimental::WithoutInitializing )
|
||||
: DynRankView( Kokkos::Impl::ViewCtorProp< std::string , Kokkos::Impl::WithoutInitializing_t >( arg_prop.label , Kokkos::WithoutInitializing )
|
||||
|
||||
, Impl::DynRankDimTraits<typename traits::specialize>::createLayout(arg_layout)
|
||||
)
|
||||
{}
|
||||
|
@@ -1064,7 +1194,7 @@ public:
|
|||
, const size_t arg_N6 = ~size_t(0)
|
||||
, const size_t arg_N7 = ~size_t(0)
|
||||
)
|
||||
: DynRankView(Impl::ViewCtorProp< std::string , Kokkos::Experimental::Impl::WithoutInitializing_t >( arg_prop.label , Kokkos::Experimental::WithoutInitializing ), arg_N0, arg_N1, arg_N2, arg_N3, arg_N4, arg_N5, arg_N6, arg_N7 )
|
||||
: DynRankView(Kokkos::Impl::ViewCtorProp< std::string , Kokkos::Impl::WithoutInitializing_t >( arg_prop.label , Kokkos::WithoutInitializing ), arg_N0, arg_N1, arg_N2, arg_N3, arg_N4, arg_N5, arg_N6, arg_N7 )
|
||||
{}
|
||||
|
||||
//----------------------------------------
|
||||
|
@@ -1097,14 +1227,14 @@ public:
|
|||
, const size_t arg_N6 = ~size_t(0)
|
||||
, const size_t arg_N7 = ~size_t(0)
|
||||
)
|
||||
: DynRankView( Impl::ViewCtorProp<pointer_type>(arg_ptr) , arg_N0, arg_N1, arg_N2, arg_N3, arg_N4, arg_N5, arg_N6, arg_N7 )
|
||||
: DynRankView( Kokkos::Impl::ViewCtorProp<pointer_type>(arg_ptr) , arg_N0, arg_N1, arg_N2, arg_N3, arg_N4, arg_N5, arg_N6, arg_N7 )
|
||||
{}
|
||||
|
||||
explicit KOKKOS_INLINE_FUNCTION
|
||||
DynRankView( pointer_type arg_ptr
|
||||
, typename traits::array_layout & arg_layout
|
||||
)
|
||||
: DynRankView( Impl::ViewCtorProp<pointer_type>(arg_ptr) , arg_layout )
|
||||
: DynRankView( Kokkos::Impl::ViewCtorProp<pointer_type>(arg_ptr) , arg_layout )
|
||||
{}
|
||||
|
||||
|
||||
|
@@ -1140,7 +1270,7 @@ public:
|
|||
explicit KOKKOS_INLINE_FUNCTION
|
||||
DynRankView( const typename traits::execution_space::scratch_memory_space & arg_space
|
||||
, const typename traits::array_layout & arg_layout )
|
||||
: DynRankView( Impl::ViewCtorProp<pointer_type>(
|
||||
: DynRankView( Kokkos::Impl::ViewCtorProp<pointer_type>(
|
||||
reinterpret_cast<pointer_type>(
|
||||
arg_space.get_shmem( map_type::memory_span(
|
||||
Impl::DynRankDimTraits<typename traits::specialize>::createLayout( arg_layout ) //is this correct?
|
||||
|
@@ -1159,7 +1289,7 @@ public:
|
|||
, const size_t arg_N6 = ~size_t(0)
|
||||
, const size_t arg_N7 = ~size_t(0) )
|
||||
|
||||
: DynRankView( Impl::ViewCtorProp<pointer_type>(
|
||||
: DynRankView( Kokkos::Impl::ViewCtorProp<pointer_type>(
|
||||
reinterpret_cast<pointer_type>(
|
||||
arg_space.get_shmem(
|
||||
map_type::memory_span(
|
||||
|
@@ -1190,7 +1320,6 @@ namespace Impl {
|
|||
struct DynRankSubviewTag {};
|
||||
|
||||
} // namespace Impl
|
||||
} // namespace Experimental
|
||||
|
||||
namespace Impl {
|
||||
|
||||
|
@@ -1207,7 +1336,7 @@ struct ViewMapping
|
|||
std::is_same< typename SrcTraits::array_layout
|
||||
, Kokkos::LayoutStride >::value
|
||||
)
|
||||
), Kokkos::Experimental::Impl::DynRankSubviewTag >::type
|
||||
), Kokkos::Impl::DynRankSubviewTag >::type
|
||||
, SrcTraits
|
||||
, Args ... >
|
||||
{
|
||||
|
@@ -1279,11 +1408,11 @@ public:
|
|||
};
|
||||
|
||||
|
||||
typedef Kokkos::Experimental::DynRankView< value_type , array_layout , typename SrcTraits::device_type , typename SrcTraits::memory_traits > ret_type;
|
||||
typedef Kokkos::DynRankView< value_type , array_layout , typename SrcTraits::device_type , typename SrcTraits::memory_traits > ret_type;
|
||||
|
||||
template < typename T , class ... P >
|
||||
KOKKOS_INLINE_FUNCTION
|
||||
static ret_type subview( const unsigned src_rank , Kokkos::Experimental::DynRankView< T , P...> const & src
|
||||
static ret_type subview( const unsigned src_rank , Kokkos::DynRankView< T , P...> const & src
|
||||
, Args ... args )
|
||||
{
|
||||
|
||||
|
@@ -1351,20 +1480,19 @@ public:
|
|||
|
||||
} // end Impl
|
||||
|
||||
namespace Experimental {
|
||||
|
||||
template< class V , class ... Args >
|
||||
using Subdynrankview = typename Kokkos::Impl::ViewMapping< Kokkos::Experimental::Impl::DynRankSubviewTag , V , Args... >::ret_type ;
|
||||
using Subdynrankview = typename Kokkos::Impl::ViewMapping< Kokkos::Impl::DynRankSubviewTag , V , Args... >::ret_type ;
|
||||
|
||||
template< class D , class ... P , class ...Args >
|
||||
KOKKOS_INLINE_FUNCTION
|
||||
Subdynrankview< ViewTraits<D******* , P...> , Args... >
|
||||
subdynrankview( const Kokkos::Experimental::DynRankView< D , P... > &src , Args...args)
|
||||
subdynrankview( const Kokkos::DynRankView< D , P... > &src , Args...args)
|
||||
{
|
||||
if ( src.rank() > sizeof...(Args) ) //allow sizeof...(Args) >= src.rank(), ignore the remaining args
|
||||
{ Kokkos::abort("subdynrankview: num of args must be >= rank of the source DynRankView"); }
|
||||
|
||||
typedef Kokkos::Impl::ViewMapping< Kokkos::Experimental::Impl::DynRankSubviewTag , Kokkos::ViewTraits< D*******, P... > , Args... > metafcn ;
|
||||
typedef Kokkos::Impl::ViewMapping< Kokkos::Impl::DynRankSubviewTag , Kokkos::ViewTraits< D*******, P... > , Args... > metafcn ;
|
||||
|
||||
return metafcn::subview( src.rank() , src , args... );
|
||||
}
|
||||
|
@@ -1373,16 +1501,14 @@ subdynrankview( const Kokkos::Experimental::DynRankView< D , P... > &src , Args.
|
|||
template< class D , class ... P , class ...Args >
|
||||
KOKKOS_INLINE_FUNCTION
|
||||
Subdynrankview< ViewTraits<D******* , P...> , Args... >
|
||||
subview( const Kokkos::Experimental::DynRankView< D , P... > &src , Args...args)
|
||||
subview( const Kokkos::DynRankView< D , P... > &src , Args...args)
|
||||
{
|
||||
return subdynrankview( src , args... );
|
||||
}
|
||||
|
||||
} // namespace Experimental
|
||||
} // namespace Kokkos
|
||||
|
||||
namespace Kokkos {
|
||||
namespace Experimental {
|
||||
|
||||
// overload == and !=
|
||||
template< class LT , class ... LP , class RT , class ... RP >
|
||||
|
@ -1422,13 +1548,11 @@ bool operator != ( const DynRankView<LT,LP...> & lhs ,
|
|||
return ! ( operator==(lhs,rhs) );
|
||||
}
|
||||
|
||||
} //end Experimental
|
||||
} //end Kokkos
|
||||
|
||||
//----------------------------------------------------------------------------
|
||||
//----------------------------------------------------------------------------
|
||||
namespace Kokkos {
|
||||
namespace Experimental {
|
||||
namespace Impl {
|
||||
|
||||
template< class OutputView , typename Enable = void >
|
||||
|
@ -1455,7 +1579,7 @@ struct DynRankViewFill {
|
|||
for ( size_t i4 = 0 ; i4 < n4 ; ++i4 ) {
|
||||
for ( size_t i5 = 0 ; i5 < n5 ; ++i5 ) {
|
||||
for ( size_t i6 = 0 ; i6 < n6 ; ++i6 ) {
|
||||
output(i0,i1,i2,i3,i4,i5,i6) = input ;
|
||||
output.access(i0,i1,i2,i3,i4,i5,i6) = input ;
|
||||
}}}}}}
|
||||
}
|
||||
|
||||
|
@ -1498,14 +1622,14 @@ struct DynRankViewRemap {
|
|||
|
||||
DynRankViewRemap( const OutputView & arg_out , const InputView & arg_in )
|
||||
: output( arg_out ), input( arg_in )
|
||||
, n0( std::min( (size_t)arg_out.dimension_0() , (size_t)arg_in.dimension_0() ) )
|
||||
, n1( std::min( (size_t)arg_out.dimension_1() , (size_t)arg_in.dimension_1() ) )
|
||||
, n2( std::min( (size_t)arg_out.dimension_2() , (size_t)arg_in.dimension_2() ) )
|
||||
, n3( std::min( (size_t)arg_out.dimension_3() , (size_t)arg_in.dimension_3() ) )
|
||||
, n4( std::min( (size_t)arg_out.dimension_4() , (size_t)arg_in.dimension_4() ) )
|
||||
, n5( std::min( (size_t)arg_out.dimension_5() , (size_t)arg_in.dimension_5() ) )
|
||||
, n6( std::min( (size_t)arg_out.dimension_6() , (size_t)arg_in.dimension_6() ) )
|
||||
, n7( std::min( (size_t)arg_out.dimension_7() , (size_t)arg_in.dimension_7() ) )
|
||||
, n0( std::min( (size_t)arg_out.extent(0) , (size_t)arg_in.extent(0) ) )
|
||||
, n1( std::min( (size_t)arg_out.extent(1) , (size_t)arg_in.extent(1) ) )
|
||||
, n2( std::min( (size_t)arg_out.extent(2) , (size_t)arg_in.extent(2) ) )
|
||||
, n3( std::min( (size_t)arg_out.extent(3) , (size_t)arg_in.extent(3) ) )
|
||||
, n4( std::min( (size_t)arg_out.extent(4) , (size_t)arg_in.extent(4) ) )
|
||||
, n5( std::min( (size_t)arg_out.extent(5) , (size_t)arg_in.extent(5) ) )
|
||||
, n6( std::min( (size_t)arg_out.extent(6) , (size_t)arg_in.extent(6) ) )
|
||||
, n7( std::min( (size_t)arg_out.extent(7) , (size_t)arg_in.extent(7) ) )
|
||||
{
|
||||
typedef Kokkos::RangePolicy< ExecSpace > Policy ;
|
||||
const Kokkos::Impl::ParallelFor< DynRankViewRemap , Policy > closure( *this , Policy( 0 , n0 ) );
|
||||
|
@ -1521,18 +1645,16 @@ struct DynRankViewRemap {
|
|||
for ( size_t i4 = 0 ; i4 < n4 ; ++i4 ) {
|
||||
for ( size_t i5 = 0 ; i5 < n5 ; ++i5 ) {
|
||||
for ( size_t i6 = 0 ; i6 < n6 ; ++i6 ) {
|
||||
output(i0,i1,i2,i3,i4,i5,i6) = input(i0,i1,i2,i3,i4,i5,i6);
|
||||
output.access(i0,i1,i2,i3,i4,i5,i6) = input.access(i0,i1,i2,i3,i4,i5,i6);
|
||||
}}}}}}
|
||||
}
|
||||
};
|
||||
|
||||
} /* namespace Impl */
|
||||
} /* namespace Experimental */
|
||||
} /* namespace Kokkos */
|
||||
|
||||
|
||||
namespace Kokkos {
|
||||
namespace Experimental {
|
||||
|
||||
/** \brief Deep copy a value from Host memory into a view. */
|
||||
template< class DT , class ... DP >
|
||||
|
@ -1549,7 +1671,7 @@ void deep_copy
|
|||
typename ViewTraits<DT,DP...>::value_type >::value
|
||||
, "deep_copy requires non-const type" );
|
||||
|
||||
Kokkos::Experimental::Impl::DynRankViewFill< DynRankView<DT,DP...> >( dst , value );
|
||||
Kokkos::Impl::DynRankViewFill< DynRankView<DT,DP...> >( dst , value );
|
||||
}
|
||||
|
||||
/** \brief Deep copy into a value in Host memory from a view. */
|
||||
|
@ -1585,7 +1707,7 @@ void deep_copy
|
|||
std::is_same< typename DstType::traits::specialize , void >::value &&
|
||||
std::is_same< typename SrcType::traits::specialize , void >::value
|
||||
&&
|
||||
( Kokkos::Experimental::is_dyn_rank_view<DstType>::value || Kokkos::Experimental::is_dyn_rank_view<SrcType>::value)
|
||||
( Kokkos::is_dyn_rank_view<DstType>::value || Kokkos::is_dyn_rank_view<SrcType>::value)
|
||||
)>::type * = 0 )
|
||||
{
|
||||
static_assert(
|
||||
|
@ -1641,14 +1763,15 @@ void deep_copy
|
|||
dst.span_is_contiguous() &&
|
||||
src.span_is_contiguous() &&
|
||||
dst.span() == src.span() &&
|
||||
dst.dimension_0() == src.dimension_0() &&
|
||||
dst.dimension_1() == src.dimension_1() &&
|
||||
dst.dimension_2() == src.dimension_2() &&
|
||||
dst.dimension_3() == src.dimension_3() &&
|
||||
dst.dimension_4() == src.dimension_4() &&
|
||||
dst.dimension_5() == src.dimension_5() &&
|
||||
dst.dimension_6() == src.dimension_6() &&
|
||||
dst.dimension_7() == src.dimension_7() ) {
|
||||
dst.extent(0) == src.extent(0) &&
|
||||
|
||||
dst.extent(1) == src.extent(1) &&
|
||||
dst.extent(2) == src.extent(2) &&
|
||||
dst.extent(3) == src.extent(3) &&
|
||||
dst.extent(4) == src.extent(4) &&
|
||||
dst.extent(5) == src.extent(5) &&
|
||||
dst.extent(6) == src.extent(6) &&
|
||||
dst.extent(7) == src.extent(7) ) {
|
||||
|
||||
const size_t nbytes = sizeof(typename dst_type::value_type) * dst.span();
|
||||
|
||||
|
@ -1673,14 +1796,14 @@ void deep_copy
|
|||
dst.span_is_contiguous() &&
|
||||
src.span_is_contiguous() &&
|
||||
dst.span() == src.span() &&
|
||||
dst.dimension_0() == src.dimension_0() &&
|
||||
dst.dimension_1() == src.dimension_1() &&
|
||||
dst.dimension_2() == src.dimension_2() &&
|
||||
dst.dimension_3() == src.dimension_3() &&
|
||||
dst.dimension_4() == src.dimension_4() &&
|
||||
dst.dimension_5() == src.dimension_5() &&
|
||||
dst.dimension_6() == src.dimension_6() &&
|
||||
dst.dimension_7() == src.dimension_7() &&
|
||||
dst.extent(0) == src.extent(0) &&
|
||||
dst.extent(1) == src.extent(1) &&
|
||||
dst.extent(2) == src.extent(2) &&
|
||||
dst.extent(3) == src.extent(3) &&
|
||||
dst.extent(4) == src.extent(4) &&
|
||||
dst.extent(5) == src.extent(5) &&
|
||||
dst.extent(6) == src.extent(6) &&
|
||||
dst.extent(7) == src.extent(7) &&
|
||||
dst.stride_0() == src.stride_0() &&
|
||||
dst.stride_1() == src.stride_1() &&
|
||||
dst.stride_2() == src.stride_2() &&
|
||||
|
@ -1697,11 +1820,11 @@ void deep_copy
|
|||
}
|
||||
else if ( DstExecCanAccessSrc ) {
|
||||
// Copying data between views in accessible memory spaces and either non-contiguous or incompatible shape.
|
||||
Kokkos::Experimental::Impl::DynRankViewRemap< dst_type , src_type >( dst , src );
|
||||
Kokkos::Impl::DynRankViewRemap< dst_type , src_type >( dst , src );
|
||||
}
|
||||
else if ( SrcExecCanAccessDst ) {
|
||||
// Copying data between views in accessible memory spaces and either non-contiguous or incompatible shape.
|
||||
Kokkos::Experimental::Impl::DynRankViewRemap< dst_type , src_type , src_execution_space >( dst , src );
|
||||
Kokkos::Impl::DynRankViewRemap< dst_type , src_type , src_execution_space >( dst , src );
|
||||
}
|
||||
else {
|
||||
Kokkos::Impl::throw_runtime_exception("deep_copy given views that would require a temporary allocation");
|
||||
|
@ -1709,7 +1832,6 @@ void deep_copy
|
|||
}
|
||||
}
|
||||
|
||||
} //end Experimental
|
||||
} //end Kokkos
|
||||
|
||||
|
||||
|
@ -1717,8 +1839,6 @@ void deep_copy
|
|||
//----------------------------------------------------------------------------
|
||||
|
||||
namespace Kokkos {
|
||||
namespace Experimental {
|
||||
|
||||
namespace Impl {
|
||||
|
||||
|
||||
|
@ -1726,7 +1846,7 @@ namespace Impl {
|
|||
template<class Space, class T, class ... P>
|
||||
struct MirrorDRViewType {
|
||||
// The incoming view_type
|
||||
typedef typename Kokkos::Experimental::DynRankView<T,P...> src_view_type;
|
||||
typedef typename Kokkos::DynRankView<T,P...> src_view_type;
|
||||
// The memory space for the mirror view
|
||||
typedef typename Space::memory_space memory_space;
|
||||
// Check whether it is the same memory space
|
||||
|
@ -1736,7 +1856,7 @@ struct MirrorDRViewType {
|
|||
// The data type (we probably want it non-const, since otherwise we can't even deep_copy to it).
|
||||
typedef typename src_view_type::non_const_data_type data_type;
|
||||
// The destination view type if it is not the same memory space
|
||||
typedef Kokkos::Experimental::DynRankView<data_type,array_layout,Space> dest_view_type;
|
||||
typedef Kokkos::DynRankView<data_type,array_layout,Space> dest_view_type;
|
||||
// If it is the same memory_space return the existing view_type
|
||||
// This will also keep the unmanaged trait if necessary
|
||||
typedef typename std::conditional<is_same_memspace,src_view_type,dest_view_type>::type view_type;
|
||||
|
@ -1745,7 +1865,7 @@ struct MirrorDRViewType {
|
|||
template<class Space, class T, class ... P>
|
||||
struct MirrorDRVType {
|
||||
// The incoming view_type
|
||||
typedef typename Kokkos::Experimental::DynRankView<T,P...> src_view_type;
|
||||
typedef typename Kokkos::DynRankView<T,P...> src_view_type;
|
||||
// The memory space for the mirror view
|
||||
typedef typename Space::memory_space memory_space;
|
||||
// Check whether it is the same memory space
|
||||
|
@ -1755,12 +1875,11 @@ struct MirrorDRVType {
|
|||
// The data type (we probably want it non-const, since otherwise we can't even deep_copy to it).
|
||||
typedef typename src_view_type::non_const_data_type data_type;
|
||||
// The destination view type if it is not the same memory space
|
||||
typedef Kokkos::Experimental::DynRankView<data_type,array_layout,Space> view_type;
|
||||
typedef Kokkos::DynRankView<data_type,array_layout,Space> view_type;
|
||||
};
|
||||
|
||||
}
|
||||
|
||||
|
||||
template< class T , class ... P >
|
||||
inline
|
||||
typename DynRankView<T,P...>::HostMirror
|
||||
|
@ -1799,7 +1918,7 @@ create_mirror( const DynRankView<T,P...> & src
|
|||
|
||||
// Create a mirror in a new space (specialization for different space)
|
||||
template<class Space, class T, class ... P>
|
||||
typename Impl::MirrorDRVType<Space,T,P ...>::view_type create_mirror(const Space& , const Kokkos::Experimental::DynRankView<T,P...> & src) {
|
||||
typename Impl::MirrorDRVType<Space,T,P ...>::view_type create_mirror(const Space& , const Kokkos::DynRankView<T,P...> & src) {
|
||||
return typename Impl::MirrorDRVType<Space,T,P ...>::view_type(src.label(), Impl::reconstructLayout(src.layout(), src.rank()) );
|
||||
}
|
||||
|
||||
|
@ -1836,13 +1955,13 @@ create_mirror_view( const DynRankView<T,P...> & src
|
|||
)>::type * = 0
|
||||
)
|
||||
{
|
||||
return Kokkos::Experimental::create_mirror( src );
|
||||
return Kokkos::create_mirror( src );
|
||||
}
|
||||
|
||||
// Create a mirror view in a new space (specialization for same space)
|
||||
template<class Space, class T, class ... P>
|
||||
typename Impl::MirrorDRViewType<Space,T,P ...>::view_type
|
||||
create_mirror_view(const Space& , const Kokkos::Experimental::DynRankView<T,P...> & src
|
||||
create_mirror_view(const Space& , const Kokkos::DynRankView<T,P...> & src
|
||||
, typename std::enable_if<Impl::MirrorDRViewType<Space,T,P ...>::is_same_memspace>::type* = 0 ) {
|
||||
return src;
|
||||
}
|
||||
|
@ -1850,12 +1969,11 @@ create_mirror_view(const Space& , const Kokkos::Experimental::DynRankView<T,P...
|
|||
// Create a mirror view in a new space (specialization for different space)
|
||||
template<class Space, class T, class ... P>
|
||||
typename Impl::MirrorDRViewType<Space,T,P ...>::view_type
|
||||
create_mirror_view(const Space& , const Kokkos::Experimental::DynRankView<T,P...> & src
|
||||
create_mirror_view(const Space& , const Kokkos::DynRankView<T,P...> & src
|
||||
, typename std::enable_if<!Impl::MirrorDRViewType<Space,T,P ...>::is_same_memspace>::type* = 0 ) {
|
||||
return typename Impl::MirrorDRViewType<Space,T,P ...>::view_type(src.label(), Impl::reconstructLayout(src.layout(), src.rank()) );
|
||||
}
|
||||
|
||||
} //end Experimental
|
||||
} //end Kokkos
|
||||
|
||||
|
||||
|
@ -1863,7 +1981,6 @@ create_mirror_view(const Space& , const Kokkos::Experimental::DynRankView<T,P...
|
|||
//----------------------------------------------------------------------------
|
||||
|
||||
namespace Kokkos {
|
||||
namespace Experimental {
|
||||
/** \brief Resize a view with copying old data to new data at the corresponding indices. */
|
||||
template< class T , class ... P >
|
||||
inline
|
||||
|
@ -1877,13 +1994,13 @@ void resize( DynRankView<T,P...> & v ,
|
|||
const size_t n6 = ~size_t(0) ,
|
||||
const size_t n7 = ~size_t(0) )
|
||||
{
|
||||
typedef DynRankView<T,P...> drview_type ;
|
||||
typedef DynRankView<T,P...> drview_type ;
|
||||
|
||||
static_assert( Kokkos::ViewTraits<T,P...>::is_managed , "Can only resize managed views" );
|
||||
|
||||
drview_type v_resized( v.label(), n0, n1, n2, n3, n4, n5, n6 );
|
||||
|
||||
Kokkos::Experimental::Impl::DynRankViewRemap< drview_type , drview_type >( v_resized, v );
|
||||
Kokkos::Impl::DynRankViewRemap< drview_type , drview_type >( v_resized, v );
|
||||
|
||||
v = v_resized ;
|
||||
}
|
||||
|
@ -1911,25 +2028,7 @@ void realloc( DynRankView<T,P...> & v ,
|
|||
v = drview_type( label, n0, n1, n2, n3, n4, n5, n6 );
|
||||
}
|
||||
|
||||
} //end Experimental
|
||||
|
||||
} //end Kokkos
|
||||
|
||||
using Kokkos::Experimental::is_dyn_rank_view ;
|
||||
|
||||
namespace Kokkos {
|
||||
|
||||
template< typename D , class ... P >
|
||||
using DynRankView = Kokkos::Experimental::DynRankView< D , P... > ;
|
||||
|
||||
using Kokkos::Experimental::deep_copy ;
|
||||
using Kokkos::Experimental::create_mirror ;
|
||||
using Kokkos::Experimental::create_mirror_view ;
|
||||
using Kokkos::Experimental::subdynrankview ;
|
||||
using Kokkos::Experimental::subview ;
|
||||
using Kokkos::Experimental::resize ;
|
||||
using Kokkos::Experimental::realloc ;
|
||||
|
||||
} //end Kokkos
|
||||
#endif
|
||||
|
||||
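The removed lines at the end of this file re-exported the `Kokkos::Experimental` names into `Kokkos` through an alias template and a series of using-declarations. With the definitions now living in `Kokkos` directly, that re-export block is no longer needed. The sketch below illustrates the old re-export pattern in isolation. The namespace `lib` and the names `Widget` and `get` are hypothetical stand-ins, not Kokkos identifiers.

```cpp
#include <type_traits>

namespace lib {
namespace Experimental {

// Hypothetical class template that starts life in an experimental namespace.
template <class T>
struct Widget {
  T value;
};

// Hypothetical free function that operates on it.
template <class T>
T get(const Widget<T>& w) {
  return w.value;
}

}  // namespace Experimental

// Re-export: an alias template makes the type visible under its final name...
template <class T>
using Widget = Experimental::Widget<T>;

// ...and a using-declaration does the same for the free function.
using Experimental::get;

}  // namespace lib

// The alias names the very same type, so existing code keeps compiling
// whichever spelling it uses.
static_assert(std::is_same<lib::Widget<int>, lib::Experimental::Widget<int>>::value,
              "alias template and original name denote one type");
```

Once the definitions themselves move into the outer namespace, as this commit does for `DynRankView`, the alias and the using-declarations can simply be deleted.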
|
|
|
@ -35,7 +35,7 @@
|
|||
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
//
|
||||
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
|
||||
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
|
||||
//
|
||||
// ************************************************************************
|
||||
//@HEADER
|
||||
|
@ -52,7 +52,33 @@
|
|||
namespace Kokkos {
|
||||
namespace Experimental {
|
||||
|
||||
// Simple metafunction for choosing memory space
|
||||
// In the current implementation, if memory_space == CudaSpace,
|
||||
// use CudaUVMSpace for the chunk 'array' allocation, which
|
||||
// will contain pointers to chunks of memory allocated
|
||||
// in CudaSpace
|
||||
namespace Impl {
|
||||
template < class MemSpace >
|
||||
struct ChunkArraySpace {
|
||||
using memory_space = MemSpace;
|
||||
};
|
||||
|
||||
#ifdef KOKKOS_ENABLE_CUDA
|
||||
template <>
|
||||
struct ChunkArraySpace< Kokkos::CudaSpace > {
|
||||
using memory_space = typename Kokkos::CudaUVMSpace;
|
||||
};
|
||||
#endif
|
||||
#ifdef KOKKOS_ENABLE_ROCM
|
||||
template <>
|
||||
struct ChunkArraySpace< Kokkos::Experimental::ROCmSpace > {
|
||||
using memory_space = typename Kokkos::Experimental::ROCmHostPinnedSpace;
|
||||
};
|
||||
#endif
|
||||
} // end namespace Impl
|
||||
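`ChunkArraySpace` above is a small type-level lookup: the primary template returns the memory space unchanged, and a full specialization redirects one particular space to another. The following is a minimal sketch of that pattern on its own. The tag types `HostMem`, `DeviceMem` and `DeviceManagedMem` and the name `ChunkArraySpaceSketch` are hypothetical; they only stand in for the real Kokkos memory spaces.

```cpp
#include <type_traits>

// Hypothetical memory-space tags (stand-ins, not Kokkos types).
struct HostMem {};
struct DeviceMem {};
struct DeviceManagedMem {};

// Primary template: by default the chunk-pointer array lives in the
// same space as the data it points to.
template <class MemSpace>
struct ChunkArraySpaceSketch {
  using memory_space = MemSpace;
};

// Full specialization: for device-only memory, place the pointer array
// in a space the host can also dereference.
template <>
struct ChunkArraySpaceSketch<DeviceMem> {
  using memory_space = DeviceManagedMem;
};

static_assert(std::is_same<ChunkArraySpaceSketch<HostMem>::memory_space,
                           HostMem>::value,
              "default case maps a space to itself");
static_assert(std::is_same<ChunkArraySpaceSketch<DeviceMem>::memory_space,
                           DeviceManagedMem>::value,
              "specialized case redirects to the host-accessible space");
```

The benefit of this design is that callers write `typename ChunkArraySpaceSketch<S>::memory_space` once and never branch on the space themselves.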
|
||||
/** \brief Dynamic views are restricted to rank-one and no layout.
|
||||
* Resize only occurs on host outside of parallel_regions.
|
||||
* Subviews are not allowed.
|
||||
*/
|
||||
template< typename DataType , typename ... P >
|
||||
|
@ -66,7 +92,7 @@ private:
|
|||
|
||||
template< class , class ... > friend class DynamicView ;
|
||||
|
||||
typedef Kokkos::Experimental::Impl::SharedAllocationTracker track_type ;
|
||||
typedef Kokkos::Impl::SharedAllocationTracker track_type ;
|
||||
|
||||
static_assert( traits::rank == 1 && traits::rank_dynamic == 1
|
||||
, "DynamicView must be rank-one" );
|
||||
|
@ -86,18 +112,14 @@ private:
|
|||
{ Kokkos::abort("Kokkos::DynamicView ERROR: attempt to access inaccessible memory space"); };
|
||||
};
|
||||
|
||||
public:
|
||||
|
||||
typedef Kokkos::MemoryPool< typename traits::device_type > memory_pool ;
|
||||
|
||||
private:
|
||||
|
||||
memory_pool m_pool ;
|
||||
track_type m_track ;
|
||||
typename traits::value_type ** m_chunks ;
|
||||
unsigned m_chunk_shift ;
|
||||
unsigned m_chunk_mask ;
|
||||
unsigned m_chunk_max ;
|
||||
typename traits::value_type ** m_chunks ; // array of pointers to 'chunks' of memory
|
||||
unsigned m_chunk_shift ; // ceil(log2(m_chunk_size))
|
||||
unsigned m_chunk_mask ; // m_chunk_size - 1
|
||||
unsigned m_chunk_max ; // number of entries in the chunk array - each pointing to a chunk of extent == m_chunk_size entries
|
||||
unsigned m_chunk_size ; // 2 << (m_chunk_shift - 1)
|
||||
|
||||
public:
|
||||
|
||||
|
@ -125,28 +147,24 @@ public:
|
|||
|
||||
enum { Rank = 1 };
|
||||
|
||||
KOKKOS_INLINE_FUNCTION
|
||||
size_t allocation_extent() const noexcept
|
||||
{
|
||||
uintptr_t n = *reinterpret_cast<const uintptr_t*>( m_chunks + m_chunk_max );
|
||||
return (n << m_chunk_shift);
|
||||
}
|
||||
|
||||
KOKKOS_INLINE_FUNCTION
|
||||
size_t chunk_size() const noexcept
|
||||
{
|
||||
return m_chunk_size;
|
||||
}
|
||||
|
||||
KOKKOS_INLINE_FUNCTION
|
||||
size_t size() const noexcept
|
||||
{
|
||||
uintptr_t n = 0 ;
|
||||
|
||||
if ( Kokkos::Impl::MemorySpaceAccess
|
||||
< Kokkos::Impl::ActiveExecutionMemorySpace
|
||||
, typename traits::memory_space
|
||||
>::accessible ) {
|
||||
n = *reinterpret_cast<const uintptr_t*>( m_chunks + m_chunk_max );
|
||||
}
|
||||
#if defined( KOKKOS_ACTIVE_EXECUTION_MEMORY_SPACE_HOST )
|
||||
else {
|
||||
Kokkos::Impl::DeepCopy< Kokkos::HostSpace
|
||||
, typename traits::memory_space
|
||||
, Kokkos::HostSpace::execution_space >
|
||||
( & n
|
||||
, reinterpret_cast<const uintptr_t*>( m_chunks + m_chunk_max )
|
||||
, sizeof(uintptr_t) );
|
||||
}
|
||||
#endif
|
||||
return n << m_chunk_shift ;
|
||||
size_t extent_0 = *reinterpret_cast<const size_t*>( m_chunks + m_chunk_max +1 );
|
||||
return extent_0;
|
||||
}
|
||||
|
||||
template< typename iType >
|
||||
|
@ -159,6 +177,7 @@ public:
|
|||
size_t extent_int( const iType & r ) const
|
||||
{ return r == 0 ? size() : 1 ; }
|
||||
|
||||
#ifdef KOKKOS_ENABLE_DEPRECATED_CODE
|
||||
KOKKOS_INLINE_FUNCTION size_t dimension_0() const { return size(); }
|
||||
KOKKOS_INLINE_FUNCTION constexpr size_t dimension_1() const { return 1 ; }
|
||||
KOKKOS_INLINE_FUNCTION constexpr size_t dimension_2() const { return 1 ; }
|
||||
|
@ -167,6 +186,7 @@ public:
|
|||
KOKKOS_INLINE_FUNCTION constexpr size_t dimension_5() const { return 1 ; }
|
||||
KOKKOS_INLINE_FUNCTION constexpr size_t dimension_6() const { return 1 ; }
|
||||
KOKKOS_INLINE_FUNCTION constexpr size_t dimension_7() const { return 1 ; }
|
||||
#endif
|
||||
|
||||
KOKKOS_INLINE_FUNCTION constexpr size_t stride_0() const { return 0 ; }
|
||||
KOKKOS_INLINE_FUNCTION constexpr size_t stride_1() const { return 0 ; }
|
||||
|
@ -180,6 +200,17 @@ public:
|
|||
template< typename iType >
|
||||
KOKKOS_INLINE_FUNCTION void stride( iType * const s ) const { *s = 0 ; }
|
||||
|
||||
//----------------------------------------
|
||||
// Allocation tracking properties
|
||||
|
||||
KOKKOS_INLINE_FUNCTION
|
||||
int use_count() const
|
||||
{ return m_track.use_count(); }
|
||||
|
||||
inline
|
||||
const std::string label() const
|
||||
{ return m_track.template get_label< typename traits::memory_space >(); }
|
||||
|
||||
//----------------------------------------------------------------------
|
||||
// Range span is the span which contains all members.
|
||||
|
||||
|
@ -234,65 +265,15 @@ public:
|
|||
}
|
||||
|
||||
//----------------------------------------
|
||||
/** \brief Resizing in parallel only increases the array size,
|
||||
* never decrease.
|
||||
*/
|
||||
KOKKOS_INLINE_FUNCTION
|
||||
void resize_parallel( size_t n ) const
|
||||
{
|
||||
typedef typename traits::value_type value_type ;
|
||||
|
||||
DynamicView::template verify_space< Kokkos::Impl::ActiveExecutionMemorySpace >::check();
|
||||
|
||||
const uintptr_t NC = ( n + m_chunk_mask ) >> m_chunk_shift ;
|
||||
|
||||
if ( m_chunk_max < NC ) {
|
||||
#if defined( KOKKOS_ENABLE_DEBUG_BOUNDS_CHECK )
|
||||
printf("DynamicView::resize_parallel(%lu) m_chunk_max(%u) NC(%lu)\n"
|
||||
, n , m_chunk_max , NC );
|
||||
#endif
|
||||
Kokkos::abort("DynamicView::resize_parallel exceeded maximum size");
|
||||
}
|
||||
|
||||
typename traits::value_type * volatile * const ch = m_chunks ;
|
||||
|
||||
// The allocated chunk counter is m_chunks[ m_chunk_max ]
|
||||
uintptr_t volatile * const pc =
|
||||
reinterpret_cast<uintptr_t volatile*>( m_chunks + m_chunk_max );
|
||||
|
||||
// Potentially concurrent iteration of allocation to the required size.
|
||||
|
||||
for ( uintptr_t jc = *pc ; jc < NC ; ) {
|
||||
|
||||
// Claim the 'jc' chunk to-be-allocated index
|
||||
|
||||
const uintptr_t jc_try = jc ;
|
||||
|
||||
// Jump iteration to the chunk counter.
|
||||
|
||||
jc = atomic_compare_exchange( pc , jc_try , jc_try + 1 );
|
||||
|
||||
if ( jc_try == jc ) {
|
||||
|
||||
ch[jc_try] = reinterpret_cast<value_type*>(
|
||||
m_pool.allocate( sizeof(value_type) << m_chunk_shift ));
|
||||
|
||||
if ( 0 == ch[jc_try] ) {
|
||||
Kokkos::abort("DynamicView::resize_parallel exhausted memory pool");
|
||||
}
|
||||
|
||||
Kokkos::memory_fence();
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
/** \brief Resizing in serial can grow or shrink the array size, */
|
||||
/** \brief Resizing in serial can grow or shrink the array size
|
||||
* up to the maximum number of chunks
|
||||
* */
|
||||
template< typename IntType >
|
||||
inline
|
||||
typename std::enable_if
|
||||
< std::is_integral<IntType>::value &&
|
||||
Kokkos::Impl::MemorySpaceAccess< Kokkos::HostSpace
|
||||
, typename traits::memory_space
|
||||
, typename Impl::ChunkArraySpace< typename traits::memory_space >::memory_space
|
||||
>::accessible
|
||||
>::type
|
||||
resize_serial( IntType const & n )
|
||||
|
@ -300,108 +281,35 @@ public:
|
|||
typedef typename traits::value_type value_type ;
|
||||
typedef value_type * pointer_type ;
|
||||
|
||||
const uintptr_t NC = ( n + m_chunk_mask ) >> m_chunk_shift ;
|
||||
const uintptr_t NC = ( n + m_chunk_mask ) >> m_chunk_shift ; // New total number of chunks needed for resize
|
||||
|
||||
if ( m_chunk_max < NC ) {
|
||||
Kokkos::abort("DynamicView::resize_serial exceeded maximum size");
|
||||
}
|
||||
|
||||
// *m_chunks[m_chunk_max] stores the current number of chunks being used
|
||||
uintptr_t * const pc =
|
||||
reinterpret_cast<uintptr_t*>( m_chunks + m_chunk_max );
|
||||
|
||||
if ( *pc < NC ) {
|
||||
while ( *pc < NC ) {
|
||||
m_chunks[*pc] = reinterpret_cast<pointer_type>
|
||||
( m_pool.allocate( sizeof(value_type) << m_chunk_shift ) );
|
||||
(
|
||||
typename traits::memory_space().allocate( sizeof(value_type) << m_chunk_shift )
|
||||
);
|
||||
++*pc ;
|
||||
}
|
||||
}
|
||||
else {
|
||||
while ( NC + 1 <= *pc ) {
|
||||
--*pc ;
|
||||
m_pool.deallocate( m_chunks[*pc]
|
||||
, sizeof(value_type) << m_chunk_shift );
|
||||
typename traits::memory_space().deallocate( m_chunks[*pc]
|
||||
, sizeof(value_type) << m_chunk_shift );
|
||||
m_chunks[*pc] = 0 ;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
//----------------------------------------
|
||||
|
||||
struct ResizeSerial {
|
||||
memory_pool m_pool ;
|
||||
typename traits::value_type ** m_chunks ;
|
||||
uintptr_t * m_pc ;
|
||||
uintptr_t m_nc ;
|
||||
unsigned m_chunk_shift ;
|
||||
|
||||
KOKKOS_INLINE_FUNCTION
|
||||
void operator()( int ) const
|
||||
{
|
||||
typedef typename traits::value_type value_type ;
|
||||
typedef value_type * pointer_type ;
|
||||
|
||||
if ( *m_pc < m_nc ) {
|
||||
while ( *m_pc < m_nc ) {
|
||||
m_chunks[*m_pc] = reinterpret_cast<pointer_type>
|
||||
( m_pool.allocate( sizeof(value_type) << m_chunk_shift ) );
|
||||
++*m_pc ;
|
||||
}
|
||||
}
|
||||
else {
|
||||
while ( m_nc + 1 <= *m_pc ) {
|
||||
--*m_pc ;
|
||||
m_pool.deallocate( m_chunks[*m_pc]
|
||||
, sizeof(value_type) << m_chunk_shift );
|
||||
m_chunks[*m_pc] = 0 ;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
ResizeSerial( memory_pool const & arg_pool
|
||||
, typename traits::value_type ** arg_chunks
|
||||
, uintptr_t * arg_pc
|
||||
, uintptr_t arg_nc
|
||||
, unsigned arg_chunk_shift
|
||||
)
|
||||
: m_pool( arg_pool )
|
||||
, m_chunks( arg_chunks )
|
||||
, m_pc( arg_pc )
|
||||
, m_nc( arg_nc )
|
||||
, m_chunk_shift( arg_chunk_shift )
|
||||
{}
|
||||
};
|
||||
|
||||
template< typename IntType >
|
||||
inline
|
||||
typename std::enable_if
|
||||
< std::is_integral<IntType>::value &&
|
||||
! Kokkos::Impl::MemorySpaceAccess< Kokkos::HostSpace
|
||||
, typename traits::memory_space
|
||||
>::accessible
|
||||
>::type
|
||||
resize_serial( IntType const & n )
|
||||
{
|
||||
const uintptr_t NC = ( n + m_chunk_mask ) >> m_chunk_shift ;
|
||||
|
||||
if ( m_chunk_max < NC ) {
|
||||
Kokkos::abort("DynamicView::resize_serial exceeded maximum size");
|
||||
}
|
||||
|
||||
// Must dispatch kernel
|
||||
|
||||
typedef Kokkos::RangePolicy< typename traits::execution_space > Range ;
|
||||
|
||||
uintptr_t * const pc =
|
||||
reinterpret_cast<uintptr_t*>( m_chunks + m_chunk_max );
|
||||
|
||||
Kokkos::Impl::ParallelFor<ResizeSerial,Range>
|
||||
closure( ResizeSerial( m_pool, m_chunks, pc, NC, m_chunk_shift )
|
||||
, Range(0,1) );
|
||||
|
||||
closure.execute();
|
||||
|
||||
traits::execution_space::fence();
|
||||
// *m_chunks[m_chunk_max+1] stores the 'extent' requested by resize
|
||||
*(pc+1) = n;
|
||||
}
|
||||
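The serial resize above keeps two bookkeeping values in the slots just past the chunk pointers: slot `m_chunk_max` holds the number of chunks in use and slot `m_chunk_max + 1` holds the requested extent. A host-only sketch of that layout follows. It is not the Kokkos implementation: the class name `ChunkTableSketch` is hypothetical, `std::malloc`/`std::free` stand in for the memory space's `allocate`/`deallocate`, the element type is fixed to `double`, and the function returns `false` where the source calls `Kokkos::abort`.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

class ChunkTableSketch {
 public:
  // shift: log2 of the chunk size in elements; max_chunks: pointer slots.
  ChunkTableSketch(unsigned shift, std::size_t max_chunks)
      : shift_(shift), max_chunks_(max_chunks), slots_(max_chunks + 2) {}

  ~ChunkTableSketch() { resize(0); }  // releases every chunk still held
  ChunkTableSketch(const ChunkTableSketch&) = delete;
  ChunkTableSketch& operator=(const ChunkTableSketch&) = delete;

  // Grow or shrink to hold n elements; false if n exceeds the capacity.
  bool resize(std::size_t n) {
    const std::size_t mask = (std::size_t(1) << shift_) - 1;
    const std::size_t needed = (n + mask) >> shift_;  // chunks, rounded up
    if (needed > max_chunks_) return false;

    std::uintptr_t& count = slots_[max_chunks_];  // chunks currently in use
    while (count < needed) {                      // grow
      void* p = std::malloc(sizeof(double) << shift_);
      if (p == nullptr) return false;
      slots_[count++] = reinterpret_cast<std::uintptr_t>(p);
    }
    while (count > needed) {                      // shrink
      --count;
      std::free(reinterpret_cast<void*>(slots_[count]));
      slots_[count] = 0;
    }
    slots_[max_chunks_ + 1] = n;  // remember the requested extent
    return true;
  }

  std::size_t size() const { return slots_[max_chunks_ + 1]; }
  std::size_t chunks_in_use() const { return slots_[max_chunks_]; }
  std::size_t allocation_extent() const { return chunks_in_use() << shift_; }

 private:
  unsigned shift_;
  std::size_t max_chunks_;
  std::vector<std::uintptr_t> slots_;  // pointers, then count, then extent
};
```

With `shift = 4` (16 elements per chunk), resizing to 33 elements uses three chunks, so `size()` reports 33 while `allocation_extent()` reports 48. This mirrors the distinction the diff introduces between `size()` and `allocation_extent()`.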
|
||||
//----------------------------------------------------------------------
|
||||
|
@ -415,12 +323,12 @@ public:
|
|||
|
||||
template< class RT , class ... RP >
|
||||
DynamicView( const DynamicView<RT,RP...> & rhs )
|
||||
: m_pool( rhs.m_pool )
|
||||
, m_track( rhs.m_track )
|
||||
: m_track( rhs.m_track )
|
||||
, m_chunks( (typename traits::value_type **) rhs.m_chunks )
|
||||
, m_chunk_shift( rhs.m_chunk_shift )
|
||||
, m_chunk_mask( rhs.m_chunk_mask )
|
||||
, m_chunk_max( rhs.m_chunk_max )
|
||||
, m_chunk_size( rhs.m_chunk_size )
|
||||
{
|
||||
typedef typename DynamicView<RT,RP...>::traits SrcTraits ;
|
||||
typedef Kokkos::Impl::ViewMapping< traits , SrcTraits , void > Mapping ;
|
||||
|
@ -430,35 +338,36 @@ public:
|
|||
//----------------------------------------------------------------------
|
||||
|
||||
struct Destroy {
|
||||
memory_pool m_pool ;
|
||||
typename traits::value_type ** m_chunks ;
|
||||
unsigned m_chunk_max ;
|
||||
bool m_destroy ;
|
||||
unsigned m_chunk_size ;
|
||||
|
||||
// Initialize or destroy array of chunk pointers.
|
||||
// Two entries beyond the max chunks are allocation counters.
|
||||
|
||||
KOKKOS_INLINE_FUNCTION
|
||||
inline
|
||||
void operator()( unsigned i ) const
|
||||
{
|
||||
if ( m_destroy && i < m_chunk_max && 0 != m_chunks[i] ) {
|
||||
m_pool.deallocate( m_chunks[i] , m_pool.min_block_size() );
|
||||
typename traits::memory_space().deallocate( m_chunks[i], m_chunk_size );
|
||||
}
|
||||
m_chunks[i] = 0 ;
|
||||
}
|
||||
|
||||
void execute( bool arg_destroy )
|
||||
{
|
||||
typedef Kokkos::RangePolicy< typename traits::execution_space > Range ;
|
||||
typedef Kokkos::RangePolicy< typename HostSpace::execution_space > Range ;
|
||||
//typedef Kokkos::RangePolicy< typename Impl::ChunkArraySpace< typename traits::memory_space >::memory_space::execution_space > Range ;
|
||||
|
||||
m_destroy = arg_destroy ;
|
||||
|
||||
Kokkos::Impl::ParallelFor<Destroy,Range>
|
||||
closure( *this , Range(0, m_chunk_max + 1) );
|
||||
closure( *this , Range(0, m_chunk_max + 2) ); // Add 2 to 'destroy' extra slots storing num_chunks and extent; previously + 1
|
||||
|
||||
closure.execute();
|
||||
|
||||
traits::execution_space::fence();
|
||||
//Impl::ChunkArraySpace< typename traits::memory_space >::memory_space::execution_space::fence();
|
||||
}
|
||||
|
||||
void construct_shared_allocation()
|
||||
|
@ -473,66 +382,64 @@ public:
|
|||
Destroy & operator = ( Destroy && ) = default ;
|
||||
Destroy & operator = ( const Destroy & ) = default ;
|
||||
|
||||
Destroy( const memory_pool & arg_pool
|
||||
, typename traits::value_type ** arg_chunk
|
||||
, const unsigned arg_chunk_max )
|
||||
: m_pool( arg_pool )
|
||||
, m_chunks( arg_chunk )
|
||||
Destroy( typename traits::value_type ** arg_chunk
|
||||
, const unsigned arg_chunk_max
|
||||
, const unsigned arg_chunk_size )
|
||||
: m_chunks( arg_chunk )
|
||||
, m_chunk_max( arg_chunk_max )
|
||||
, m_destroy( false )
|
||||
, m_chunk_size( arg_chunk_size )
|
||||
{}
|
||||
};
|
||||
|
||||
|
||||
/**\brief Allocation constructor
|
||||
*
|
||||
* Memory is allocated in chunks from the memory pool.
|
||||
* The chunk size conforms to the memory pool's chunk size.
|
||||
* Memory is allocated in chunks
|
||||
* A maximum size is required in order to allocate a
|
||||
* chunk-pointer array.
|
||||
*/
|
||||
explicit inline
|
||||
DynamicView( const std::string & arg_label
|
||||
, const memory_pool & arg_pool
|
||||
, const size_t arg_size_max )
|
||||
: m_pool( arg_pool )
|
||||
, m_track()
|
||||
, const unsigned min_chunk_size
|
||||
, const unsigned max_extent )
|
||||
: m_track()
|
||||
, m_chunks(0)
|
||||
// The memory pool chunk is guaranteed to be a power of two
|
||||
// The chunk size is guaranteed to be a power of two
|
||||
, m_chunk_shift(
|
||||
Kokkos::Impl::integral_power_of_two(
|
||||
m_pool.min_block_size()/sizeof(typename traits::value_type)) )
|
||||
, m_chunk_mask( ( 1 << m_chunk_shift ) - 1 )
|
||||
, m_chunk_max( ( arg_size_max + m_chunk_mask ) >> m_chunk_shift )
|
||||
Kokkos::Impl::integral_power_of_two_that_contains( min_chunk_size ) ) // div ceil(log2(min_chunk_size))
|
||||
, m_chunk_mask( ( 1 << m_chunk_shift ) - 1 ) // mod
|
||||
, m_chunk_max( ( max_extent + m_chunk_mask ) >> m_chunk_shift ) // max num pointers-to-chunks in array
|
||||
, m_chunk_size ( 2 << (m_chunk_shift - 1) )
|
||||
{
|
||||
typedef typename Impl::ChunkArraySpace< typename traits::memory_space >::memory_space chunk_array_memory_space;
|
||||
// A functor to deallocate all of the chunks upon final destruction
|
||||
|
||||
typedef typename traits::memory_space memory_space ;
|
||||
typedef Kokkos::Experimental::Impl::SharedAllocationRecord< memory_space , Destroy > record_type ;
|
||||
typedef Kokkos::Impl::SharedAllocationRecord< chunk_array_memory_space , Destroy > record_type ;
|
||||
|
||||
// Allocate chunk pointers and allocation counter
|
||||
record_type * const record =
|
||||
record_type::allocate( memory_space()
|
||||
record_type::allocate( chunk_array_memory_space()
|
||||
, arg_label
|
||||
, ( sizeof(pointer_type) * ( m_chunk_max + 1 ) ) );
|
||||
, ( sizeof(pointer_type) * ( m_chunk_max + 2 ) ) );
|
||||
// Allocate + 2 extra slots so that *m_chunk[m_chunk_max] == num_chunks_alloc and *m_chunk[m_chunk_max+1] == extent
|
||||
// This must match in Destroy's execute(...) method
|
||||
|
||||
m_chunks = reinterpret_cast<pointer_type*>( record->data() );
|
||||
|
||||
record->m_destroy = Destroy( m_pool , m_chunks , m_chunk_max );
|
||||
record->m_destroy = Destroy( m_chunks , m_chunk_max, m_chunk_size );
|
||||
|
||||
// Initialize to zero
|
||||
|
||||
record->m_destroy.construct_shared_allocation();
|
||||
|
||||
m_track.assign_allocated_record_to_uninitialized( record );
|
||||
}
|
||||
|
||||
};
|
||||
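The constructor above derives four values from `min_chunk_size` and `max_extent`: a shift, a mask, the maximum number of chunks, and the chunk size. The sketch below reproduces that arithmetic with plain integers. It is an illustration under assumptions: `ceil_log2` is a hypothetical stand-in for `Kokkos::Impl::integral_power_of_two_that_contains`, based only on the `ceil(log2(...))` comment in the diff, and the index split into `i >> shift` and `i & mask` is inferred from the `// div` and `// mod` comments rather than from code shown here.

```cpp
#include <cassert>
#include <cstddef>

struct ChunkLayout {
  unsigned shift;       // log2 of the chunk size
  unsigned mask;        // chunk_size - 1
  unsigned max_chunks;  // entries in the chunk-pointer array
  unsigned chunk_size;  // elements per chunk, always a power of two
};

// Smallest s such that (1u << s) >= n. Hypothetical helper.
inline unsigned ceil_log2(unsigned n) {
  assert(n >= 1 && n <= (1u << 31));
  unsigned s = 0;
  while ((1u << s) < n) ++s;
  return s;
}

inline ChunkLayout make_layout(unsigned min_chunk_size, unsigned max_extent) {
  ChunkLayout l;
  l.shift = ceil_log2(min_chunk_size);
  l.chunk_size = 1u << l.shift;
  l.mask = l.chunk_size - 1;
  // Round up so that a partially filled last chunk still gets a slot.
  l.max_chunks = static_cast<unsigned>(
      (static_cast<std::size_t>(max_extent) + l.mask) >> l.shift);
  return l;
}

// Number of chunks needed to hold n elements.
inline std::size_t chunks_needed(const ChunkLayout& l, std::size_t n) {
  return (n + l.mask) >> l.shift;
}

// Which chunk element i falls in, and where inside that chunk.
inline std::size_t chunk_index(const ChunkLayout& l, std::size_t i) {
  return i >> l.shift;
}
inline std::size_t chunk_offset(const ChunkLayout& l, std::size_t i) {
  return i & l.mask;
}
```

For example, `make_layout(1000, 5000)` rounds the chunk size up to 1024 and needs 5 pointer slots. The sketch computes the chunk size as `1u << shift`, which gives the same value as the source's `2 << (m_chunk_shift - 1)` whenever the shift is at least 1 and stays well defined when the shift is 0.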
|
||||
} // namespace Experimental
|
||||
} // namespace Kokkos
|
||||
|
||||
namespace Kokkos {
|
||||
namespace Experimental {
|
||||
|
||||
template< class T , class ... P >
|
||||
inline
|
||||
|
@ -545,11 +452,11 @@ create_mirror_view( const Kokkos::Experimental::DynamicView<T,P...> & src )
|
|||
template< class T , class ... DP , class ... SP >
|
||||
inline
|
||||
void deep_copy( const View<T,DP...> & dst
|
||||
, const DynamicView<T,SP...> & src
|
||||
, const Kokkos::Experimental::DynamicView<T,SP...> & src
|
||||
)
|
||||
{
|
||||
typedef View<T,DP...> dst_type ;
|
||||
typedef DynamicView<T,SP...> src_type ;
|
||||
typedef Kokkos::Experimental::DynamicView<T,SP...> src_type ;
|
||||
|
||||
typedef typename ViewTraits<T,DP...>::execution_space dst_execution_space ;
|
||||
typedef typename ViewTraits<T,SP...>::memory_space src_memory_space ;
|
||||
|
@ -568,11 +475,11 @@ void deep_copy( const View<T,DP...> & dst
|
|||
|
||||
template< class T , class ... DP , class ... SP >
|
||||
inline
|
||||
void deep_copy( const DynamicView<T,DP...> & dst
|
||||
void deep_copy( const Kokkos::Experimental::DynamicView<T,DP...> & dst
|
||||
, const View<T,SP...> & src
|
||||
)
|
||||
{
|
||||
typedef DynamicView<T,SP...> dst_type ;
|
||||
typedef Kokkos::Experimental::DynamicView<T,SP...> dst_type ;
|
||||
typedef View<T,DP...> src_type ;
|
||||
|
||||
typedef typename ViewTraits<T,DP...>::execution_space dst_execution_space ;
|
||||
|
@ -590,7 +497,81 @@ void deep_copy( const DynamicView<T,DP...> & dst
|
|||
}
|
||||
}
|
||||
|
||||
} // namespace Experimental
|
||||
namespace Impl {
|
||||
template<class Arg0, class ... DP , class ... SP>
|
||||
struct CommonSubview<Kokkos::Experimental::DynamicView<DP...>,Kokkos::Experimental::DynamicView<SP...>,1,Arg0> {
|
||||
typedef Kokkos::Experimental::DynamicView<DP...> DstType;
|
||||
typedef Kokkos::Experimental::DynamicView<SP...> SrcType;
|
||||
typedef DstType dst_subview_type;
|
||||
typedef SrcType src_subview_type;
|
||||
dst_subview_type dst_sub;
|
||||
src_subview_type src_sub;
|
||||
CommonSubview(const DstType& dst, const SrcType& src, const Arg0& arg0):
|
||||
dst_sub(dst),src_sub(src) {}
|
||||
};
|
||||
|
||||
template<class ...DP, class SrcType, class Arg0>
|
||||
struct CommonSubview<Kokkos::Experimental::DynamicView<DP...>,SrcType,1,Arg0> {
|
||||
typedef Kokkos::Experimental::DynamicView<DP...> DstType;
|
||||
typedef DstType dst_subview_type;
|
||||
typedef typename Kokkos::Subview<SrcType,Arg0> src_subview_type;
|
||||
dst_subview_type dst_sub;
|
||||
src_subview_type src_sub;
|
||||
CommonSubview(const DstType& dst, const SrcType& src, const Arg0& arg0):
|
||||
dst_sub(dst),src_sub(src,arg0) {}
|
||||
};
|
||||
|
||||
template<class DstType, class ...SP, class Arg0>
|
||||
struct CommonSubview<DstType,Kokkos::Experimental::DynamicView<SP...>,1,Arg0> {
|
||||
typedef Kokkos::Experimental::DynamicView<SP...> SrcType;
|
||||
typedef typename Kokkos::Subview<DstType,Arg0> dst_subview_type;
|
||||
typedef SrcType src_subview_type;
|
||||
dst_subview_type dst_sub;
|
||||
src_subview_type src_sub;
|
||||
CommonSubview(const DstType& dst, const SrcType& src, const Arg0& arg0):
|
||||
dst_sub(dst,arg0),src_sub(src) {}
|
||||
};
|
||||
|
||||
template<class ...DP,class ViewTypeB, class Layout, class ExecSpace,typename iType>
|
||||
struct ViewCopy<Kokkos::Experimental::DynamicView<DP...>,ViewTypeB,Layout,ExecSpace,1,iType> {
|
||||
Kokkos::Experimental::DynamicView<DP...> a;
|
||||
ViewTypeB b;
|
||||
|
||||
typedef Kokkos::RangePolicy<ExecSpace,Kokkos::IndexType<iType>> policy_type;
|
||||
|
||||
ViewCopy(const Kokkos::Experimental::DynamicView<DP...>& a_, const ViewTypeB& b_):a(a_),b(b_) {
|
||||
Kokkos::parallel_for("Kokkos::ViewCopy-2D",
|
||||
policy_type(0,b.extent(0)),*this);
|
||||
}
|
||||
|
||||
KOKKOS_INLINE_FUNCTION
|
||||
void operator() (const iType& i0) const {
|
||||
a(i0) = b(i0);
|
||||
};
|
||||
};
|
||||
|
||||
template<class ...DP,class ...SP, class Layout, class ExecSpace,typename iType>
|
||||
struct ViewCopy<Kokkos::Experimental::DynamicView<DP...>,
|
||||
Kokkos::Experimental::DynamicView<SP...>,Layout,ExecSpace,1,iType> {
|
||||
Kokkos::Experimental::DynamicView<DP...> a;
|
||||
Kokkos::Experimental::DynamicView<SP...> b;
|
||||
|
||||
typedef Kokkos::RangePolicy<ExecSpace,Kokkos::IndexType<iType>> policy_type;
|
||||
|
||||
ViewCopy(const Kokkos::Experimental::DynamicView<DP...>& a_,
|
||||
const Kokkos::Experimental::DynamicView<SP...>& b_):a(a_),b(b_) {
|
||||
const iType n = std::min(a.extent(0),b.extent(0));
|
||||
Kokkos::parallel_for("Kokkos::ViewCopy-2D",
|
||||
policy_type(0,n),*this);
|
||||
}
|
||||
|
||||
KOKKOS_INLINE_FUNCTION
|
||||
void operator() (const iType& i0) const {
|
||||
a(i0) = b(i0);
|
||||
};
|
||||
};
|
||||
|
||||
} // namespace Impl
|
||||
} // namespace Kokkos
|
||||
|
||||
#endif /* #ifndef KOKKOS_DYNAMIC_VIEW_HPP */
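
Note on the chunk bookkeeping in the DynamicView constructor above: the three initializers derive a mask, a chunk count, and a chunk size from a single shift value. The following is a minimal standalone sketch of that arithmetic, not Kokkos code. `ChunkLayout`, `make_chunk_layout`, `chunk_of`, and `offset_of` are hypothetical names introduced for illustration, and the index split (`i >> shift`, `i & mask`) is an assumption inferred from the `// mod` comment rather than something shown in this hunk.

```cpp
#include <cassert>

// Hypothetical illustration type; mirrors m_chunk_shift, m_chunk_mask,
// m_chunk_max and m_chunk_size from the constructor in the diff.
struct ChunkLayout {
  unsigned shift;       // log2 of the chunk size
  unsigned mask;        // chunk size - 1, used as a cheap modulo
  unsigned max_chunks;  // number of chunk pointers needed for max_extent
  unsigned size;        // elements per chunk
};

// Same expressions as the initializer list; requires shift >= 1 because
// of the (shift - 1) term in the size computation.
inline ChunkLayout make_chunk_layout(unsigned shift, unsigned max_extent) {
  ChunkLayout c;
  c.shift = shift;
  c.mask = (1u << shift) - 1u;
  c.max_chunks = (max_extent + c.mask) >> shift;  // round up to whole chunks
  c.size = 2u << (shift - 1u);                    // equals 1u << shift
  return c;
}

// Assumed element lookup: which chunk an index falls in, and where.
inline unsigned chunk_of(const ChunkLayout& c, unsigned i) { return i >> c.shift; }
inline unsigned offset_of(const ChunkLayout& c, unsigned i) { return i & c.mask; }
```

With a shift of 10 and a maximum extent of 2500, this yields 1024-element chunks and three chunk pointers. The allocation in the diff then adds two extra slots on top of that count for the allocation counter and the extent.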
|
||||
|
|
|
@ -35,7 +35,7 @@
|
|||
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
//
|
||||
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
|
||||
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
|
||||
//
|
||||
// ************************************************************************
|
||||
//@HEADER
|
||||
|
@ -69,7 +69,7 @@ public:
|
|||
clear();
|
||||
}
|
||||
|
||||
int getCapacity() const { return m_reports.h_view.dimension_0(); }
|
||||
int getCapacity() const { return m_reports.h_view.extent(0); }
|
||||
|
||||
int getNumReports();
|
||||
|
||||
|
@ -90,7 +90,7 @@ public:
|
|||
{
|
||||
int idx = Kokkos::atomic_fetch_add(&m_numReportsAttempted(), 1);
|
||||
|
||||
if (idx >= 0 && (idx < static_cast<int>(m_reports.d_view.dimension_0()))) {
|
||||
if (idx >= 0 && (idx < static_cast<int>(m_reports.d_view.extent(0)))) {
|
||||
m_reporters.d_view(idx) = reporter_id;
|
||||
m_reports.d_view(idx) = report;
|
||||
return true;
|
||||
|
@ -118,8 +118,8 @@ inline int ErrorReporter<ReportType, DeviceType>::getNumReports()
|
|||
{
|
||||
int num_reports = 0;
|
||||
Kokkos::deep_copy(num_reports,m_numReportsAttempted);
|
||||
if (num_reports > static_cast<int>(m_reports.h_view.dimension_0())) {
|
||||
num_reports = m_reports.h_view.dimension_0();
|
||||
if (num_reports > static_cast<int>(m_reports.h_view.extent(0))) {
|
||||
num_reports = m_reports.h_view.extent(0);
|
||||
}
|
||||
return num_reports;
|
||||
}
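
Note on the ErrorReporter hunks above: a report is stored only if the slot index returned by an atomic fetch-and-add is within capacity, and `getNumReports()` clamps the attempt counter to that capacity. The sketch below reproduces the pattern on the host with `std::atomic` in place of `Kokkos::atomic_fetch_add`. `BoundedReporter` and its members are hypothetical names, not part of Kokkos.

```cpp
#include <atomic>
#include <cassert>
#include <vector>

// Hypothetical host-only analogue of the bounded report buffer.
class BoundedReporter {
 public:
  explicit BoundedReporter(int capacity) : reports_(capacity), attempted_(0) {}

  int capacity() const { return static_cast<int>(reports_.size()); }

  // Claim the next slot atomically; store only if it is in range.
  bool add_report(int report) {
    const int idx = attempted_.fetch_add(1);
    if (idx >= 0 && idx < capacity()) {
      reports_[idx] = report;
      return true;
    }
    return false;  // buffer full: the attempt is counted but not stored
  }

  // Stored reports: the attempt counter clamped to capacity.
  int num_reports() const {
    const int n = attempted_.load();
    return n > capacity() ? capacity() : n;
  }

  int num_attempts() const { return attempted_.load(); }

 private:
  std::vector<int> reports_;
  std::atomic<int> attempted_;
};
```

Because the counter keeps growing past capacity, the difference between attempts and stored reports tells the caller how many reports were dropped.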
|
||||
|
|
|
@ -34,7 +34,7 @@
|
|||
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
//
|
||||
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
|
||||
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
|
||||
//
|
||||
// ************************************************************************
|
||||
//@HEADER
|
||||
|
|
|
@ -35,7 +35,7 @@
|
|||
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
//
|
||||
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
|
||||
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
|
||||
//
|
||||
// ************************************************************************
|
||||
//@HEADER
|
||||
|
@ -623,14 +623,12 @@ public:
|
|||
typename ExecSpace::memory_space,
|
||||
typename dest_type::memory_space>::value,
|
||||
"ScatterView deep_copy destination memory space not accessible");
|
||||
size_t strides[8];
|
||||
internal_view.stride(strides);
|
||||
bool is_equal = (dest.data() == internal_view.data());
|
||||
size_t start = is_equal ? 1 : 0;
|
||||
Kokkos::Impl::Experimental::ReduceDuplicates<ExecSpace, original_value_type, Op>(
|
||||
internal_view.data(),
|
||||
dest.data(),
|
||||
strides[0],
|
||||
internal_view.stride(0),
|
||||
start,
|
||||
internal_view.extent(0),
|
||||
internal_view.label());
|
||||
|
@ -772,9 +770,6 @@ public:
|
|||
typename ExecSpace::memory_space,
|
||||
typename dest_type::memory_space>::value,
|
||||
"ScatterView deep_copy destination memory space not accessible");
|
||||
size_t strides[8];
|
||||
internal_view.stride(strides);
|
||||
size_t stride = strides[internal_view_type::rank - 1];
|
||||
auto extent = internal_view.extent(
|
||||
internal_view_type::rank - 1);
|
||||
bool is_equal = (dest.data() == internal_view.data());
|
||||
|
@ -782,7 +777,7 @@ public:
|
|||
Kokkos::Impl::Experimental::ReduceDuplicates<ExecSpace, original_value_type, Op>(
|
||||
internal_view.data(),
|
||||
dest.data(),
|
||||
stride,
|
||||
internal_view.stride(internal_view_type::rank - 1),
|
||||
start,
|
||||
extent,
|
||||
internal_view.label());
|
||||
|
|
|
@ -35,7 +35,7 @@
|
|||
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
//
|
||||
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
|
||||
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
|
||||
//
|
||||
// ************************************************************************
|
||||
//@HEADER
|
||||
|
@ -70,7 +70,7 @@ namespace Impl {
|
|||
|
||||
KOKKOS_INLINE_FUNCTION
|
||||
void operator() (const int_type& iRow) const {
|
||||
const int_type num_rows = row_offsets.dimension_0()-1;
|
||||
const int_type num_rows = row_offsets.extent(0)-1;
|
||||
const int_type num_entries = row_offsets(num_rows);
|
||||
const int_type total_cost = num_entries + num_rows*cost_per_row;
|
||||
|
||||
|
@ -105,7 +105,7 @@ namespace Impl {
|
|||
}
|
||||
} else {
|
||||
if((count >= (current_block + 1) * cost_per_workset) ||
|
||||
(iRow+2 == row_offsets.dimension_0())) {
|
||||
(iRow+2 == row_offsets.extent(0))) {
|
||||
if(end_block>current_block+1) {
|
||||
int_type num_block = end_block-current_block;
|
||||
row_block_offsets(current_block+1) = iRow;
|
||||
|
@ -330,8 +330,8 @@ public:
|
|||
*/
|
||||
KOKKOS_INLINE_FUNCTION
|
||||
size_type numRows() const {
|
||||
return (row_map.dimension_0 () != 0) ?
|
||||
row_map.dimension_0 () - static_cast<size_type> (1) :
|
||||
return (row_map.extent(0) != 0) ?
|
||||
row_map.extent(0) - static_cast<size_type> (1) :
|
||||
static_cast<size_type> (0);
|
||||
}
|
||||
|
||||
|
@ -458,7 +458,7 @@ DataType maximum_entry( const StaticCrsGraph< DataType , Arg1Type , Arg2Type , S
|
|||
typedef Impl::StaticCrsGraphMaximumEntry< GraphType > FunctorType ;
|
||||
|
||||
DataType result = 0 ;
|
||||
Kokkos::parallel_reduce( graph.entries.dimension_0(),
|
||||
Kokkos::parallel_reduce( graph.entries.extent(0),
|
||||
FunctorType(graph), result );
|
||||
return result ;
|
||||
}
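
Note on the StaticCrsGraph hunks above: `numRows()` treats `row_map` as an offsets array with one more entry than there are rows, and guards the empty case so the unsigned subtraction cannot wrap. The sketch below shows the same convention with `std::vector`. `CrsGraphSketch` is a hypothetical type, and the serial `maximum_entry` assumes the Kokkos functor computes a maximum starting from 0, which the name suggests but this diff does not show.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a CRS graph: row_map holds numRows + 1
// offsets into entries (or is empty for an empty graph).
struct CrsGraphSketch {
  std::vector<std::size_t> row_map;
  std::vector<int> entries;
};

// Same guard as numRows() in the diff: an empty row_map means zero rows.
inline std::size_t num_rows(const CrsGraphSketch& g) {
  return g.row_map.size() != 0 ? g.row_map.size() - 1 : 0;
}

// Serial analogue of the parallel_reduce over all entries (assumed max).
inline int maximum_entry(const CrsGraphSketch& g) {
  int result = 0;
  for (int e : g.entries) result = std::max(result, e);
  return result;
}
```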
|
||||
|
|
|
@ -35,7 +35,7 @@
|
|||
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
//
|
||||
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
|
||||
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
|
||||
//
|
||||
// ************************************************************************
|
||||
//@HEADER
|
||||
|
@ -477,7 +477,7 @@ public:
|
|||
/// kernel.
|
||||
KOKKOS_INLINE_FUNCTION
|
||||
size_type hash_capacity() const
|
||||
{ return m_hash_lists.dimension_0(); }
|
||||
{ return m_hash_lists.extent(0); }
|
||||
|
||||
//---------------------------------------------------------------------------
|
||||
//---------------------------------------------------------------------------
|
||||
|
@ -507,13 +507,13 @@ public:
|
|||
int volatile & failed_insert_ref = m_scalars((int)failed_insert_idx) ;
|
||||
|
||||
const size_type hash_value = m_hasher(k);
|
||||
const size_type hash_list = hash_value % m_hash_lists.dimension_0();
|
||||
const size_type hash_list = hash_value % m_hash_lists.extent(0);
|
||||
|
||||
size_type * curr_ptr = & m_hash_lists[ hash_list ];
|
||||
size_type new_index = invalid_index ;
|
||||
|
||||
// Force integer multiply to long
|
||||
size_type index_hint = static_cast<size_type>( (static_cast<double>(hash_list) * capacity()) / m_hash_lists.dimension_0());
|
||||
size_type index_hint = static_cast<size_type>( (static_cast<double>(hash_list) * capacity()) / m_hash_lists.extent(0));
|
||||
|
||||
size_type find_attempts = 0;
|
||||
|
||||
|
@ -645,7 +645,7 @@ public:
|
|||
KOKKOS_INLINE_FUNCTION
|
||||
size_type find( const key_type & k) const
|
||||
{
|
||||
size_type curr = 0u < capacity() ? m_hash_lists( m_hasher(k) % m_hash_lists.dimension_0() ) : invalid_index ;
|
||||
size_type curr = 0u < capacity() ? m_hash_lists( m_hasher(k) % m_hash_lists.extent(0) ) : invalid_index ;
|
||||
|
||||
KOKKOS_NONTEMPORAL_PREFETCH_LOAD(&m_keys[curr != invalid_index ? curr : 0]);
|
||||
while (curr != invalid_index && !m_equal_to( m_keys[curr], k) ) {
|
||||
|
@ -741,7 +741,7 @@ public:
|
|||
>::type
|
||||
create_copy_view( UnorderedMap<SKey, SValue, SDevice, Hasher,EqualTo> const& src)
|
||||
{
|
||||
if (m_hash_lists.ptr_on_device() != src.m_hash_lists.ptr_on_device()) {
|
||||
if (m_hash_lists.data() != src.m_hash_lists.data()) {
|
||||
|
||||
insertable_map_type tmp;
|
||||
|
||||
|
@ -750,23 +750,23 @@ public:
|
|||
tmp.m_equal_to = src.m_equal_to;
|
||||
tmp.m_size = src.size();
|
||||
tmp.m_available_indexes = bitset_type( src.capacity() );
|
||||
tmp.m_hash_lists = size_type_view( ViewAllocateWithoutInitializing("UnorderedMap hash list"), src.m_hash_lists.dimension_0() );
|
||||
tmp.m_next_index = size_type_view( ViewAllocateWithoutInitializing("UnorderedMap next index"), src.m_next_index.dimension_0() );
|
||||
tmp.m_keys = key_type_view( ViewAllocateWithoutInitializing("UnorderedMap keys"), src.m_keys.dimension_0() );
|
||||
tmp.m_values = value_type_view( ViewAllocateWithoutInitializing("UnorderedMap values"), src.m_values.dimension_0() );
|
||||
tmp.m_hash_lists = size_type_view( ViewAllocateWithoutInitializing("UnorderedMap hash list"), src.m_hash_lists.extent(0) );
|
||||
tmp.m_next_index = size_type_view( ViewAllocateWithoutInitializing("UnorderedMap next index"), src.m_next_index.extent(0) );
|
||||
tmp.m_keys = key_type_view( ViewAllocateWithoutInitializing("UnorderedMap keys"), src.m_keys.extent(0) );
|
||||
tmp.m_values = value_type_view( ViewAllocateWithoutInitializing("UnorderedMap values"), src.m_values.extent(0) );
|
||||
tmp.m_scalars = scalars_view("UnorderedMap scalars");
|
||||
|
||||
Kokkos::deep_copy(tmp.m_available_indexes, src.m_available_indexes);
|
||||
|
||||
typedef Kokkos::Impl::DeepCopy< typename device_type::memory_space, typename SDevice::memory_space > raw_deep_copy;
|
||||
|
||||
raw_deep_copy(tmp.m_hash_lists.ptr_on_device(), src.m_hash_lists.ptr_on_device(), sizeof(size_type)*src.m_hash_lists.dimension_0());
|
||||
raw_deep_copy(tmp.m_next_index.ptr_on_device(), src.m_next_index.ptr_on_device(), sizeof(size_type)*src.m_next_index.dimension_0());
|
||||
raw_deep_copy(tmp.m_keys.ptr_on_device(), src.m_keys.ptr_on_device(), sizeof(key_type)*src.m_keys.dimension_0());
|
||||
raw_deep_copy(tmp.m_hash_lists.data(), src.m_hash_lists.data(), sizeof(size_type)*src.m_hash_lists.extent(0));
|
||||
raw_deep_copy(tmp.m_next_index.data(), src.m_next_index.data(), sizeof(size_type)*src.m_next_index.extent(0));
|
||||
raw_deep_copy(tmp.m_keys.data(), src.m_keys.data(), sizeof(key_type)*src.m_keys.extent(0));
|
||||
if (!is_set) {
|
||||
raw_deep_copy(tmp.m_values.ptr_on_device(), src.m_values.ptr_on_device(), sizeof(impl_value_type)*src.m_values.dimension_0());
|
||||
raw_deep_copy(tmp.m_values.data(), src.m_values.data(), sizeof(impl_value_type)*src.m_values.extent(0));
|
||||
}
|
||||
raw_deep_copy(tmp.m_scalars.ptr_on_device(), src.m_scalars.ptr_on_device(), sizeof(int)*num_scalars );
|
||||
raw_deep_copy(tmp.m_scalars.data(), src.m_scalars.data(), sizeof(int)*num_scalars );
|
||||
|
||||
*this = tmp;
|
||||
}
|
||||
|
@ -784,21 +784,21 @@ private: // private member functions
|
|||
{
|
||||
typedef Kokkos::Impl::DeepCopy< typename device_type::memory_space, Kokkos::HostSpace > raw_deep_copy;
|
||||
const int true_ = true;
|
||||
raw_deep_copy(m_scalars.ptr_on_device() + flag, &true_, sizeof(int));
|
||||
raw_deep_copy(m_scalars.data() + flag, &true_, sizeof(int));
|
||||
}
|
||||
|
||||
void reset_flag(int flag) const
|
||||
{
|
||||
typedef Kokkos::Impl::DeepCopy< typename device_type::memory_space, Kokkos::HostSpace > raw_deep_copy;
|
||||
const int false_ = false;
|
||||
raw_deep_copy(m_scalars.ptr_on_device() + flag, &false_, sizeof(int));
|
||||
raw_deep_copy(m_scalars.data() + flag, &false_, sizeof(int));
|
||||
}
|
||||
|
||||
bool get_flag(int flag) const
|
||||
{
|
||||
typedef Kokkos::Impl::DeepCopy< Kokkos::HostSpace, typename device_type::memory_space > raw_deep_copy;
|
||||
int result = false;
|
||||
raw_deep_copy(&result, m_scalars.ptr_on_device() + flag, sizeof(int));
|
||||
raw_deep_copy(&result, m_scalars.data() + flag, sizeof(int));
|
||||
return result;
|
||||
}
|
||||
|
||||
|
|
|
@ -35,7 +35,7 @@
|
|||
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
//
|
||||
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
|
||||
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
|
||||
//
|
||||
// ************************************************************************
|
||||
//@HEADER
|
||||
|
|
|
@ -35,7 +35,7 @@
|
|||
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
//
|
||||
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
|
||||
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
|
||||
//
|
||||
// ************************************************************************
|
||||
//@HEADER
|
||||
|
@ -80,7 +80,7 @@ struct BitsetCount
|
|||
size_type apply() const
|
||||
{
|
||||
size_type count = 0u;
|
||||
parallel_reduce( m_bitset.m_blocks.dimension_0(), *this, count );
|
||||
parallel_reduce( m_bitset.m_blocks.extent(0), *this, count );
|
||||
return count;
|
||||
}
|
||||
|
||||
|
|
|
@ -34,7 +34,7 @@
|
|||
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
//
|
||||
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
|
||||
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
|
||||
//
|
||||
// ************************************************************************
|
||||
//@HEADER
|
||||
|
|
|
@ -35,7 +35,7 @@
|
|||
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
//
|
||||
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
|
||||
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
|
||||
//
|
||||
// ************************************************************************
|
||||
//@HEADER
|
||||
|
|
|
@ -35,7 +35,7 @@
|
|||
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
//
|
||||
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
|
||||
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
|
||||
//
|
||||
// ************************************************************************
|
||||
//@HEADER
|
||||
|
|
|
@ -35,7 +35,7 @@
|
|||
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
//
|
||||
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
|
||||
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
|
||||
//
|
||||
// ************************************************************************
|
||||
//@HEADER
|
||||
|
@ -102,7 +102,7 @@ struct UnorderedMapErase
|
|||
|
||||
void apply() const
|
||||
{
|
||||
parallel_for(m_map.m_hash_lists.dimension_0(), *this);
|
||||
parallel_for(m_map.m_hash_lists.extent(0), *this);
|
||||
}
|
||||
|
||||
KOKKOS_INLINE_FUNCTION
|
||||
|
@ -170,7 +170,7 @@ struct UnorderedMapHistogram
|
|||
|
||||
void calculate()
|
||||
{
|
||||
parallel_for(m_map.m_hash_lists.dimension_0(), *this);
|
||||
parallel_for(m_map.m_hash_lists.extent(0), *this);
|
||||
}
|
||||
|
||||
void clear()
|
||||
|
@ -185,7 +185,7 @@ struct UnorderedMapHistogram
|
|||
host_histogram_view host_copy = create_mirror_view(m_length);
|
||||
Kokkos::deep_copy(host_copy, m_length);
|
||||
|
||||
for (int i=0, size = host_copy.dimension_0(); i<size; ++i)
|
||||
for (int i=0, size = host_copy.extent(0); i<size; ++i)
|
||||
{
|
||||
out << host_copy[i] << " , ";
|
||||
}
|
||||
|
@ -197,7 +197,7 @@ struct UnorderedMapHistogram
|
|||
host_histogram_view host_copy = create_mirror_view(m_distance);
|
||||
Kokkos::deep_copy(host_copy, m_distance);
|
||||
|
||||
for (int i=0, size = host_copy.dimension_0(); i<size; ++i)
|
||||
for (int i=0, size = host_copy.extent(0); i<size; ++i)
|
||||
{
|
||||
out << host_copy[i] << " , ";
|
||||
}
|
||||
|
@ -209,7 +209,7 @@ struct UnorderedMapHistogram
|
|||
host_histogram_view host_copy = create_mirror_view(m_block_distance);
|
||||
Kokkos::deep_copy(host_copy, m_block_distance);
|
||||
|
||||
for (int i=0, size = host_copy.dimension_0(); i<size; ++i)
|
||||
for (int i=0, size = host_copy.extent(0); i<size; ++i)
|
||||
{
|
||||
out << host_copy[i] << " , ";
|
||||
}
|
||||
|
@ -261,7 +261,7 @@ struct UnorderedMapPrint
|
|||
|
||||
void apply()
|
||||
{
|
||||
parallel_for(m_map.m_hash_lists.dimension_0(), *this);
|
||||
parallel_for(m_map.m_hash_lists.extent(0), *this);
|
||||
}
|
||||
|
||||
KOKKOS_INLINE_FUNCTION
|
||||
|
|
|
@ -34,7 +34,7 @@
|
|||
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
//
|
||||
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
|
||||
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
|
||||
//
|
||||
// ************************************************************************
|
||||
//@HEADER
|
||||
|
|
|
@ -35,7 +35,7 @@
|
|||
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
//
|
||||
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
|
||||
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
|
||||
//
|
||||
// ************************************************************************
|
||||
//@HEADER
|
||||
|
@ -83,13 +83,9 @@ protected:
|
|||
static void SetUpTestCase()
|
||||
{
|
||||
std::cout << std::setprecision(5) << std::scientific;
|
||||
Kokkos::HostSpace::execution_space::initialize();
|
||||
Kokkos::Cuda::initialize( Kokkos::Cuda::SelectDevice(0) );
|
||||
}
|
||||
static void TearDownTestCase()
|
||||
{
|
||||
Kokkos::Cuda::finalize();
|
||||
Kokkos::HostSpace::execution_space::finalize();
|
||||
}
|
||||
};
|
||||
|
||||
|
|
|
@ -35,7 +35,7 @@
|
|||
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
//
|
||||
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
|
||||
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
|
||||
//
|
||||
// ************************************************************************
|
||||
//@HEADER
|
||||
|
@ -88,10 +88,10 @@ namespace Impl {
|
|||
|
||||
a.template sync<typename ViewType::host_mirror_space>();
|
||||
Scalar count = 0;
|
||||
for(unsigned int i = 0; i<a.d_view.dimension_0(); i++)
|
||||
for(unsigned int j = 0; j<a.d_view.dimension_1(); j++)
|
||||
for(unsigned int i = 0; i<a.d_view.extent(0); i++)
|
||||
for(unsigned int j = 0; j<a.d_view.extent(1); j++)
|
||||
count += a.h_view(i,j);
|
||||
return count - a.d_view.dimension_0()*a.d_view.dimension_1()-2-4-3*2;
|
||||
return count - a.d_view.extent(0)*a.d_view.extent(1)-2-4-3*2;
|
||||
}
|
||||
|
||||
|
||||
|
|
File diff suppressed because it is too large
|
@ -35,7 +35,7 @@
|
|||
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
//
|
||||
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
|
||||
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
|
||||
//
|
||||
// ************************************************************************
|
||||
//@HEADER
|
||||
|
@ -61,114 +61,181 @@ struct TestDynamicView
|
|||
typedef typename Space::execution_space execution_space ;
|
||||
typedef typename Space::memory_space memory_space ;
|
||||
|
||||
typedef Kokkos::MemoryPool<typename Space::device_type> memory_pool_type;
|
||||
|
||||
typedef Kokkos::Experimental::DynamicView<Scalar*,Space> view_type;
|
||||
typedef typename view_type::const_type const_view_type ;
|
||||
|
||||
typedef typename Kokkos::TeamPolicy<execution_space>::member_type member_type ;
|
||||
typedef double value_type;
|
||||
|
||||
struct TEST {};
|
||||
struct VERIFY {};
|
||||
|
||||
view_type a;
|
||||
const unsigned total_size ;
|
||||
|
||||
TestDynamicView( const view_type & arg_a , const unsigned arg_total )
|
||||
: a(arg_a), total_size( arg_total ) {}
|
||||
|
||||
KOKKOS_INLINE_FUNCTION
|
||||
void operator() ( const TEST , member_type team_member, double& value) const
|
||||
{
|
||||
const unsigned int team_idx = team_member.league_rank() * team_member.team_size();
|
||||
|
||||
if ( team_member.team_rank() == 0 ) {
|
||||
unsigned n = team_idx + team_member.team_size();
|
||||
|
||||
if ( total_size < n ) n = total_size ;
|
||||
|
||||
a.resize_parallel( n );
|
||||
|
||||
if ( a.extent(0) < n ) {
|
||||
Kokkos::abort("GrowTest TEST failed resize_parallel");
|
||||
}
|
||||
}
|
||||
|
||||
// Make sure resize is done for all team members:
|
||||
team_member.team_barrier();
|
||||
|
||||
const unsigned int val = team_idx + team_member.team_rank();
|
||||
|
||||
if ( val < total_size ) {
|
||||
value += val ;
|
||||
|
||||
a( val ) = val ;
|
||||
}
|
||||
}
|
||||
|
||||
KOKKOS_INLINE_FUNCTION
|
||||
void operator() ( const VERIFY , member_type team_member, double& value) const
|
||||
{
|
||||
const unsigned int val =
|
||||
team_member.team_rank() +
|
||||
team_member.league_rank() * team_member.team_size();
|
||||
|
||||
if ( val < total_size ) {
|
||||
|
||||
if ( val != a(val) ) {
|
||||
Kokkos::abort("GrowTest VERIFY failed resize_parallel");
|
||||
}
|
||||
|
||||
value += a(val);
|
||||
}
|
||||
}
|
||||
|
||||
static void run( unsigned arg_total_size )
|
||||
{
|
||||
typedef Kokkos::TeamPolicy<execution_space,TEST> TestPolicy ;
|
||||
typedef Kokkos::TeamPolicy<execution_space,VERIFY> VerifyPolicy ;
|
||||
// Test: Create DynamicView, initialize size (via resize), run through parallel_for to set values, check values (via parallel_reduce); resize values and repeat
|
||||
// Case 1: min_chunk_size is a power of 2
|
||||
{
|
||||
view_type da("da", 1024, arg_total_size );
|
||||
ASSERT_EQ( da.size(), 0 );
|
||||
// Init
|
||||
unsigned da_size = arg_total_size / 8;
|
||||
da.resize_serial(da_size);
|
||||
ASSERT_EQ( da.size(), da_size );
|
||||
|
||||
// printf("TestDynamicView::run(%d) construct memory pool\n",arg_total_size);
|
||||
#if defined( KOKKOS_ENABLE_CXX11_DISPATCH_LAMBDA )
|
||||
#if !defined(KOKKOS_ENABLE_CUDA) || ( 8000 <= CUDA_VERSION )
|
||||
Kokkos::parallel_for( Kokkos::RangePolicy<execution_space>(0, da_size), KOKKOS_LAMBDA ( const int i )
|
||||
{
|
||||
da(i) = Scalar(i);
|
||||
}
|
||||
);
|
||||
|
||||
const size_t total_alloc_size = arg_total_size * sizeof(Scalar) * 1.2 ;
|
||||
const size_t superblock = std::min( total_alloc_size , size_t(1000000) );
|
||||
value_type result_sum = 0.0;
|
||||
Kokkos::parallel_reduce( Kokkos::RangePolicy<execution_space>(0, da_size), KOKKOS_LAMBDA ( const int i, value_type& partial_sum )
|
||||
{
|
||||
partial_sum += (value_type)da(i);
|
||||
}
|
||||
, result_sum
|
||||
);
|
||||
|
||||
memory_pool_type pool( memory_space()
|
||||
, total_alloc_size
|
||||
, 500 /* min block size in bytes */
|
||||
, 30000 /* max block size in bytes */
|
||||
, superblock
|
||||
);
|
||||
ASSERT_EQ(result_sum, (value_type)( da_size * (da_size - 1) / 2 ) );
|
||||
#endif
|
||||
#endif
|
||||
|
||||
// printf("TestDynamicView::run(%d) construct dynamic view\n",arg_total_size);
|
||||
// add 3x more entries i.e. 4x larger than previous size
|
||||
// the first 1/4 should remain the same
|
||||
unsigned da_resize = arg_total_size / 2;
|
||||
da.resize_serial(da_resize);
|
||||
ASSERT_EQ( da.size(), da_resize );
|
||||
|
||||
view_type da("A",pool,arg_total_size);
|
||||
#if defined( KOKKOS_ENABLE_CXX11_DISPATCH_LAMBDA )
|
||||
#if !defined(KOKKOS_ENABLE_CUDA) || ( 8000 <= CUDA_VERSION )
|
||||
Kokkos::parallel_for( Kokkos::RangePolicy<execution_space>(da_size, da_resize), KOKKOS_LAMBDA ( const int i )
|
||||
{
|
||||
da(i) = Scalar(i);
|
||||
}
|
||||
);
|
||||
|
||||
const_view_type ca(da);
|
||||
value_type new_result_sum = 0.0;
|
||||
Kokkos::parallel_reduce( Kokkos::RangePolicy<execution_space>(da_size, da_resize), KOKKOS_LAMBDA ( const int i, value_type& partial_sum )
|
||||
{
|
||||
partial_sum += (value_type)da(i);
|
||||
}
|
||||
, new_result_sum
|
||||
);
|
||||
|
||||
// printf("TestDynamicView::run(%d) construct test functor\n",arg_total_size);
|
||||
ASSERT_EQ(new_result_sum+result_sum, (value_type)( da_resize * (da_resize - 1) / 2 ) );
|
||||
#endif
|
||||
#endif
|
||||
} // end scope
|
||||
|
||||
TestDynamicView functor(da,arg_total_size);
|
||||
// Test: Create DynamicView, initialize size (via resize), run through parallel_for to set values, check values (via parallel_reduce); resize values and repeat
|
||||
// Case 2: min_chunk_size is NOT a power of 2
|
||||
{
|
||||
view_type da("da", 1023, arg_total_size );
|
||||
ASSERT_EQ( da.size(), 0 );
|
||||
// Init
|
||||
unsigned da_size = arg_total_size / 8;
|
||||
da.resize_serial(da_size);
|
||||
ASSERT_EQ( da.size(), da_size );
|
||||
|
||||
const unsigned team_size = TestPolicy::team_size_recommended(functor);
|
||||
const unsigned league_size = ( arg_total_size + team_size - 1 ) / team_size ;
|
||||
#if defined( KOKKOS_ENABLE_CXX11_DISPATCH_LAMBDA )
|
||||
#if !defined(KOKKOS_ENABLE_CUDA) || ( 8000 <= CUDA_VERSION )
|
||||
Kokkos::parallel_for( Kokkos::RangePolicy<execution_space>(0, da_size), KOKKOS_LAMBDA ( const int i )
|
||||
{
|
||||
da(i) = Scalar(i);
|
||||
}
|
||||
);
|
||||
|
||||
double reference = 0;
|
||||
double result = 0;
|
||||
value_type result_sum = 0.0;
|
||||
Kokkos::parallel_reduce( Kokkos::RangePolicy<execution_space>(0, da_size), KOKKOS_LAMBDA ( const int i, value_type& partial_sum )
|
||||
{
|
||||
partial_sum += (value_type)da(i);
|
||||
}
|
||||
, result_sum
|
||||
);
|
||||
|
||||
// printf("TestDynamicView::run(%d) run functor test\n",arg_total_size);
|
||||
ASSERT_EQ(result_sum, (value_type)( da_size * (da_size - 1) / 2 ) );
|
||||
#endif
|
||||
#endif
|
||||
|
||||
Kokkos::parallel_reduce( TestPolicy(league_size,team_size) , functor , reference);
|
||||
execution_space::fence();
|
||||
// add 3x more entries i.e. 4x larger than previous size
|
||||
// the first 1/4 should remain the same
|
||||
unsigned da_resize = arg_total_size / 2;
|
||||
da.resize_serial(da_resize);
|
||||
ASSERT_EQ( da.size(), da_resize );
|
||||
|
||||
#if defined( KOKKOS_ENABLE_CXX11_DISPATCH_LAMBDA )
|
||||
#if !defined(KOKKOS_ENABLE_CUDA) || ( 8000 <= CUDA_VERSION )
|
||||
Kokkos::parallel_for( Kokkos::RangePolicy<execution_space>(da_size, da_resize), KOKKOS_LAMBDA ( const int i )
|
||||
{
|
||||
da(i) = Scalar(i);
|
||||
}
|
||||
);
|
||||
|
||||
// printf("TestDynamicView::run(%d) run functor verify\n",arg_total_size);
|
||||
value_type new_result_sum = 0.0;
|
||||
Kokkos::parallel_reduce( Kokkos::RangePolicy<execution_space>(da_size, da_resize), KOKKOS_LAMBDA ( const int i, value_type& partial_sum )
|
||||
{
|
||||
partial_sum += (value_type)da(i);
|
||||
}
|
||||
, new_result_sum
|
||||
);
|
||||
|
||||
Kokkos::parallel_reduce( VerifyPolicy(league_size,team_size) , functor , result );
|
||||
execution_space::fence();
|
||||
ASSERT_EQ(new_result_sum+result_sum, (value_type)( da_resize * (da_resize - 1) / 2 ) );
|
||||
#endif
|
||||
#endif
|
||||
} // end scope
|
||||
|
||||
// printf("TestDynamicView::run(%d) done\n",arg_total_size);
|
||||
// Test: Create DynamicView, initialize size (via resize), run through parallel_for to set values, check values (via parallel_reduce); resize values and repeat
|
||||
// Case 3: resize reduces the size
|
||||
{
|
||||
view_type da("da", 1023, arg_total_size );
|
||||
ASSERT_EQ( da.size(), 0 );
|
||||
// Init
|
||||
unsigned da_size = arg_total_size / 2;
|
||||
da.resize_serial(da_size);
|
||||
ASSERT_EQ( da.size(), da_size );
|
||||
|
||||
#if defined( KOKKOS_ENABLE_CXX11_DISPATCH_LAMBDA )
|
||||
#if !defined(KOKKOS_ENABLE_CUDA) || ( 8000 <= CUDA_VERSION )
|
||||
Kokkos::parallel_for( Kokkos::RangePolicy<execution_space>(0, da_size), KOKKOS_LAMBDA ( const int i )
|
||||
{
|
||||
da(i) = Scalar(i);
|
||||
}
|
||||
);
|
||||
|
||||
value_type result_sum = 0.0;
|
||||
Kokkos::parallel_reduce( Kokkos::RangePolicy<execution_space>(0, da_size), KOKKOS_LAMBDA ( const int i, value_type& partial_sum )
|
||||
{
|
||||
partial_sum += (value_type)da(i);
|
||||
}
|
||||
, result_sum
|
||||
);
|
||||
|
||||
ASSERT_EQ(result_sum, (value_type)( da_size * (da_size - 1) / 2 ) );
|
||||
#endif
|
||||
#endif
|
||||
|
||||
// remove the final 3/4 entries i.e. first 1/4 remain
|
||||
unsigned da_resize = arg_total_size / 8;
|
||||
da.resize_serial(da_resize);
|
||||
ASSERT_EQ( da.size(), da_resize );
|
||||
|
||||
#if defined( KOKKOS_ENABLE_CXX11_DISPATCH_LAMBDA )
|
||||
#if !defined(KOKKOS_ENABLE_CUDA) || ( 8000 <= CUDA_VERSION )
|
||||
Kokkos::parallel_for( Kokkos::RangePolicy<execution_space>(0, da_resize), KOKKOS_LAMBDA ( const int i )
|
||||
{
|
||||
da(i) = Scalar(i);
|
||||
}
|
||||
);
|
||||
|
||||
value_type new_result_sum = 0.0;
|
||||
Kokkos::parallel_reduce( Kokkos::RangePolicy<execution_space>(0, da_resize), KOKKOS_LAMBDA ( const int i, value_type& partial_sum )
|
||||
{
|
||||
partial_sum += (value_type)da(i);
|
||||
}
|
||||
, new_result_sum
|
||||
);
|
||||
|
||||
ASSERT_EQ(new_result_sum, (value_type)( da_resize * (da_resize - 1) / 2 ) );
|
||||
#endif
|
||||
#endif
|
||||
} // end scope
|
||||
|
||||
}
|
||||
};
|
||||
|
|
|
@ -35,7 +35,7 @@
|
|||
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
//
|
||||
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
|
||||
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
|
||||
//
|
||||
// ************************************************************************
|
||||
//@HEADER
|
||||
|
|
|
@ -35,7 +35,7 @@
|
|||
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
//
|
||||
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
|
||||
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
|
||||
//
|
||||
// ************************************************************************
|
||||
//@HEADER
|
||||
|
@ -79,13 +79,10 @@ protected:
|
|||
static void SetUpTestCase()
|
||||
{
|
||||
std::cout << std::setprecision(5) << std::scientific;
|
||||
|
||||
Kokkos::OpenMP::initialize();
|
||||
}
|
||||
|
||||
static void TearDownTestCase()
|
||||
{
|
||||
Kokkos::OpenMP::finalize();
|
||||
}
|
||||
};
|
||||
|
||||
|
|
|
@ -35,7 +35,7 @@
|
|||
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
//
|
||||
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
|
||||
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
|
||||
//
|
||||
// ************************************************************************
|
||||
//@HEADER
|
||||
|
|
|
@ -35,7 +35,7 @@
|
|||
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
//
|
||||
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
|
||||
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
|
||||
//
|
||||
// ************************************************************************
|
||||
//@HEADER
|
||||
|
@ -81,7 +81,7 @@ void test_scatter_view_config(int n)
|
|||
}
|
||||
#if defined( KOKKOS_ENABLE_CXX11_DISPATCH_LAMBDA )
|
||||
auto host_view = Kokkos::create_mirror_view_and_copy(Kokkos::HostSpace(), original_view);
|
||||
for (typename decltype(host_view)::size_type i = 0; i < host_view.dimension_0(); ++i) {
|
||||
for (typename decltype(host_view)::size_type i = 0; i < host_view.extent(0); ++i) {
|
||||
auto val0 = host_view(i, 0);
|
||||
auto val1 = host_view(i, 1);
|
||||
auto val2 = host_view(i, 2);
|
||||
|
|
|
@ -35,7 +35,7 @@
|
|||
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
//
|
||||
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
|
||||
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
|
||||
//
|
||||
// ************************************************************************
|
||||
//@HEADER
|
||||
|
@ -76,11 +76,9 @@ class serial : public ::testing::Test {
|
|||
protected:
|
||||
static void SetUpTestCase () {
|
||||
std::cout << std::setprecision(5) << std::scientific;
|
||||
Kokkos::Serial::initialize ();
|
||||
}
|
||||
|
||||
static void TearDownTestCase () {
|
||||
Kokkos::Serial::finalize ();
|
||||
}
|
||||
};
|
||||
|
||||
|
|
|
@ -35,7 +35,7 @@
|
|||
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
//
|
||||
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
|
||||
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
|
||||
//
|
||||
// ************************************************************************
|
||||
//@HEADER
|
||||
|
@ -73,7 +73,7 @@ void run_test_graph()
|
|||
dx = Kokkos::create_staticcrsgraph<dView>( "dx" , graph );
|
||||
hx = Kokkos::create_mirror( dx );
|
||||
|
||||
ASSERT_EQ( hx.row_map.dimension_0() - 1 , LENGTH );
|
||||
ASSERT_EQ( hx.row_map.extent(0) - 1 , LENGTH );
|
||||
|
||||
for ( size_t i = 0 ; i < LENGTH ; ++i ) {
|
||||
const size_t begin = hx.row_map[i];
|
||||
|
@ -115,17 +115,17 @@ void run_test_graph2()
|
|||
hView hx = Kokkos::create_mirror( dx );
|
||||
hView mx = Kokkos::create_mirror( dx );
|
||||
|
||||
ASSERT_EQ( (size_t) dx.row_map.dimension_0() , (size_t) LENGTH + 1 );
|
||||
ASSERT_EQ( (size_t) hx.row_map.dimension_0() , (size_t) LENGTH + 1 );
|
||||
ASSERT_EQ( (size_t) mx.row_map.dimension_0() , (size_t) LENGTH + 1 );
|
||||
ASSERT_EQ( (size_t) dx.row_map.extent(0) , (size_t) LENGTH + 1 );
|
||||
ASSERT_EQ( (size_t) hx.row_map.extent(0) , (size_t) LENGTH + 1 );
|
||||
ASSERT_EQ( (size_t) mx.row_map.extent(0) , (size_t) LENGTH + 1 );
|
||||
|
||||
ASSERT_EQ( (size_t) dx.entries.dimension_0() , (size_t) total_length );
|
||||
ASSERT_EQ( (size_t) hx.entries.dimension_0() , (size_t) total_length );
|
||||
ASSERT_EQ( (size_t) mx.entries.dimension_0() , (size_t) total_length );
|
||||
ASSERT_EQ( (size_t) dx.entries.extent(0) , (size_t) total_length );
|
||||
ASSERT_EQ( (size_t) hx.entries.extent(0) , (size_t) total_length );
|
||||
ASSERT_EQ( (size_t) mx.entries.extent(0) , (size_t) total_length );
|
||||
|
||||
ASSERT_EQ( (size_t) dx.entries.dimension_1() , (size_t) 3 );
|
||||
ASSERT_EQ( (size_t) hx.entries.dimension_1() , (size_t) 3 );
|
||||
ASSERT_EQ( (size_t) mx.entries.dimension_1() , (size_t) 3 );
|
||||
ASSERT_EQ( (size_t) dx.entries.extent(1) , (size_t) 3 );
|
||||
ASSERT_EQ( (size_t) hx.entries.extent(1) , (size_t) 3 );
|
||||
ASSERT_EQ( (size_t) mx.entries.extent(1) , (size_t) 3 );
|
||||
|
||||
for ( size_t i = 0 ; i < LENGTH ; ++i ) {
|
||||
const size_t entry_begin = hx.row_map[i];
|
||||
|
@ -140,7 +140,7 @@ void run_test_graph2()
|
|||
Kokkos::deep_copy( dx.entries , hx.entries );
|
||||
Kokkos::deep_copy( mx.entries , dx.entries );
|
||||
|
||||
ASSERT_EQ( mx.row_map.dimension_0() , (size_t) LENGTH + 1 );
|
||||
ASSERT_EQ( mx.row_map.extent(0) , (size_t) LENGTH + 1 );
|
||||
|
||||
for ( size_t i = 0 ; i < LENGTH ; ++i ) {
|
||||
const size_t entry_begin = mx.row_map[i];
|
||||
|
|
|
@ -35,7 +35,7 @@
|
|||
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
//
|
||||
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
|
||||
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
|
||||
//
|
||||
// ************************************************************************
|
||||
//@HEADER
|
||||
|
@ -79,25 +79,10 @@ protected:
|
|||
static void SetUpTestCase()
|
||||
{
|
||||
std::cout << std::setprecision(5) << std::scientific;
|
||||
|
||||
unsigned num_threads = 4;
|
||||
|
||||
if (Kokkos::hwloc::available()) {
|
||||
num_threads = Kokkos::hwloc::get_available_numa_count()
|
||||
* Kokkos::hwloc::get_available_cores_per_numa()
|
||||
// * Kokkos::hwloc::get_available_threads_per_core()
|
||||
;
|
||||
|
||||
}
|
||||
|
||||
std::cout << "Threads: " << num_threads << std::endl;
|
||||
|
||||
Kokkos::Threads::initialize( num_threads );
|
||||
}
|
||||
|
||||
static void TearDownTestCase()
|
||||
{
|
||||
Kokkos::Threads::finalize();
|
||||
}
|
||||
};
|
||||
|
||||
|
|
|
@ -34,7 +34,7 @@
|
|||
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
//
|
||||
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
|
||||
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
|
||||
//
|
||||
// ************************************************************************
|
||||
//@HEADER
|
||||
|
|
|
@ -34,7 +34,7 @@
|
|||
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
//
|
||||
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
|
||||
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
|
||||
//
|
||||
// ************************************************************************
|
||||
//@HEADER
|
||||
|
|
|
@ -35,7 +35,7 @@
|
|||
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
//
|
||||
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
|
||||
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
|
||||
//
|
||||
// ************************************************************************
|
||||
//@HEADER
|
||||
|
|
|
@ -35,7 +35,7 @@
|
|||
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
//
|
||||
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
|
||||
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
|
||||
//
|
||||
// ************************************************************************
|
||||
//@HEADER
|
||||
|
@ -43,10 +43,13 @@
|
|||
|
||||
#include <gtest/gtest.h>
|
||||
#include <cstdlib>
|
||||
#include <Kokkos_Macros.hpp>
|
||||
#include <Kokkos_Core.hpp>
|
||||
|
||||
int main(int argc, char *argv[]) {
|
||||
Kokkos::initialize(argc,argv);
|
||||
::testing::InitGoogleTest(&argc,argv);
|
||||
return RUN_ALL_TESTS();
|
||||
int result = RUN_ALL_TESTS();
|
||||
Kokkos::finalize();
|
||||
return result;
|
||||
}
|
||||
|
||||
|
|
|
@ -33,6 +33,7 @@ OBJ_PERF = PerfTestMain.o gtest-all.o
|
|||
OBJ_PERF += PerfTestGramSchmidt.o
|
||||
OBJ_PERF += PerfTestHexGrad.o
|
||||
OBJ_PERF += PerfTest_CustomReduction.o
|
||||
OBJ_PERF += PerfTest_ViewCopy.o
|
||||
TARGETS += KokkosCore_PerformanceTest
|
||||
TEST_TARGETS += test-performance
|
||||
|
||||
|
|
|
@ -35,7 +35,7 @@
|
|||
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
//
|
||||
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
|
||||
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
|
||||
//
|
||||
// ************************************************************************
|
||||
//@HEADER
|
||||
|
@ -76,7 +76,7 @@ void axpby( const ConstScalarType & alpha ,
|
|||
{
|
||||
typedef AXPBY< ConstScalarType , ConstVectorType , VectorType > functor ;
|
||||
|
||||
parallel_for( Y.dimension_0() , functor( alpha , X , beta , Y ) );
|
||||
parallel_for( Y.extent(0) , functor( alpha , X , beta , Y ) );
|
||||
}
|
||||
|
||||
/** \brief Y *= alpha */
|
||||
|
@ -86,7 +86,7 @@ void scale( const ConstScalarType & alpha , const VectorType & Y )
|
|||
{
|
||||
typedef Scale< ConstScalarType , VectorType > functor ;
|
||||
|
||||
parallel_for( Y.dimension_0() , functor( alpha , Y ) );
|
||||
parallel_for( Y.extent(0) , functor( alpha , Y ) );
|
||||
}
|
||||
|
||||
template< class ConstVectorType ,
|
||||
|
@ -97,7 +97,7 @@ void dot( const ConstVectorType & X ,
|
|||
{
|
||||
typedef Dot< ConstVectorType > functor ;
|
||||
|
||||
parallel_reduce( X.dimension_0() , functor( X , Y ) , finalize );
|
||||
parallel_reduce( X.extent(0) , functor( X , Y ) , finalize );
|
||||
}
|
||||
|
||||
template< class ConstVectorType ,
|
||||
|
@ -107,7 +107,7 @@ void dot( const ConstVectorType & X ,
|
|||
{
|
||||
typedef DotSingle< ConstVectorType > functor ;
|
||||
|
||||
parallel_reduce( X.dimension_0() , functor( X ) , finalize );
|
||||
parallel_reduce( X.extent(0) , functor( X ) , finalize );
|
||||
}
|
||||
|
||||
} /* namespace Kokkos */
|
||||
|
|
|
@ -35,7 +35,7 @@
|
|||
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
//
|
||||
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
|
||||
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
|
||||
//
|
||||
// ************************************************************************
|
||||
//@HEADER
|
||||
|
|
|
@ -35,7 +35,7 @@
|
|||
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
//
|
||||
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
|
||||
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
|
||||
//
|
||||
// ************************************************************************
|
||||
//@HEADER
|
||||
|
@ -86,7 +86,7 @@ void invnorm2( const VectorView & x ,
|
|||
const ValueView & r ,
|
||||
const ValueView & r_inv )
|
||||
{
|
||||
Kokkos::parallel_reduce( x.dimension_0() , InvNorm2< VectorView , ValueView >( x , r , r_inv ) );
|
||||
Kokkos::parallel_reduce( x.extent(0) , InvNorm2< VectorView , ValueView >( x , r , r_inv ) );
|
||||
}
|
||||
|
||||
// PostProcess : tmp = - ( R(j,k) = result );
|
||||
|
@ -122,7 +122,7 @@ void dot_neg( const VectorView & x ,
|
|||
const ValueView & r ,
|
||||
const ValueView & r_neg )
|
||||
{
|
||||
Kokkos::parallel_reduce( x.dimension_0() , DotM< VectorView , ValueView >( x , y , r , r_neg ) );
|
||||
Kokkos::parallel_reduce( x.extent(0) , DotM< VectorView , ValueView >( x , y , r , r_neg ) );
|
||||
}
|
||||
|
||||
|
||||
|
@ -151,7 +151,7 @@ struct ModifiedGramSchmidt
|
|||
static double factorization( const multivector_type Q_ ,
|
||||
const multivector_type R_ )
|
||||
{
|
||||
const size_type count = Q_.dimension_1();
|
||||
const size_type count = Q_.extent(1);
|
||||
value_view tmp("tmp");
|
||||
value_view one("one");
|
||||
|
||||
|
|
|
@ -35,7 +35,7 @@
|
|||
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
//
|
||||
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
|
||||
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
|
||||
//
|
||||
// ************************************************************************
|
||||
//@HEADER
|
||||
|
|
|
@ -35,7 +35,7 @@
|
|||
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
//
|
||||
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
|
||||
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
|
||||
//
|
||||
// ************************************************************************
|
||||
//@HEADER
|
||||
|
|
|
@ -35,7 +35,7 @@
|
|||
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
//
|
||||
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
|
||||
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
|
||||
//
|
||||
// ************************************************************************
|
||||
//@HEADER
|
||||
|
|
Some files were not shown because too many files have changed in this diff.