# Change Log
2017-06-09 00:42:08 +08:00
## [2.03.05]( (2017-05-27)
[Full Changelog](
**Implemented enhancements:**
- Harmonize Custom Reductions over nesting levels [\#802](
- Prevent users directly including KokkosCore\_config.h [\#815](
- DualView aborts on concurrent host/device modify \(in debug mode\) [\#814](
- Abort when running on an NVIDIA CC5.0 or higher architecture with code compiled for CC \< 5.0 [\#813](
- Add "name" function to ExecSpaces [\#806](
- Allow null Future in task spawn dependences [\#795](
- Add Unit Tests for Kokkos::complex [\#785](
- Add pow function for Kokkos::complex [\#784](
- Square root of a complex [\#729](
- Command line processing of --threads argument prevents users from having any command line arguments starting with --threads [\#760](
- Protected deprecated API with appropriate macro [\#756](
- Allow task scheduler memory pool to be used by tasks [\#747](
- View bounds checking on host-side performance: constructing a std::string [\#723](
- Add check for AppleClang as compiler distinct from check for Clang. [\#705](
- Uninclude source files for specific configurations to prevent link warning. [\#701](
- Add --small option to snapshot script [\#697](
- CMake Standalone Support [\#674](
- CMake build unit test and install [\#808](
- CMake: Fix having kokkos as a subdirectory in a pure cmake project [\#629](
- Tribits macro assumes build directory is in top level source directory [\#654](
- Use bin/nvcc\_wrapper, not config/nvcc\_wrapper [\#562](
- Allow MemoryPool::allocate\(\) to be called from multiple threads per warp. [\#487](
- Move OpenMP 4.5 OpenMPTarget backend into Develop [\#456](
- Testing on ARM testbed [\#288](
**Fixed bugs:**
- Fix label in OpenMP parallel\_reduce verify\_initialized [\#834](
- TeamScratch Level 1 on Cuda hangs [\#820](
- \[bug\] memory pool. [\#786](
- Some Reduction Tests fail on Intel 18 with aggressive vectorization on [\#774](
- Error copying dynamic view on copy of memory pool [\#773](
- CUDA stack overflow with TaskDAG test [\#758](
- ThreadVectorRange Customized Reduction Bug [\#739](
- set\_scratch\_size overflows [\#726](
- Get wrong results for compiler checks in Makefile on OS X. [\#706](
- Fix check if multiple host architectures enabled. [\#702](
- Threads Backend Does not Pass on Cray Compilers [\#609](
- Rare bug in memory pool where allocation can finish on superblock in empty state [\#452](
- LDFLAGS in core/unit\_test/Makefile: potential "undefined reference" to pthread lib [\#148](
2017-04-26 03:48:51 +08:00
## [2.03.00]( (2017-04-25)
[Full Changelog](
**Implemented enhancements:**
- UnorderedMap: make it accept Devices or MemorySpaces [\#711](
- sort to accept DynamicView and \[begin,end\) indices [\#691](
- ENABLE Macros should only be used via \#ifdef or \#if defined [\#675](
- Remove impl/Kokkos\_Synchronic\_\* [\#666](
- Turning off IVDEP for Intel 14. [\#638](
- Using an installed Kokkos in a target application using CMake [\#633](
- Create Kokkos Bill of Materials [\#632](
- MDRangePolicy and tagged evaluators [\#547](
- Add PGI support [\#289](
**Fixed bugs:**
- Output from PerTeam fails [\#733](
- Cuda: architecture flag not added to link line [\#688](
- Getting large chunks of memory for a thread team in a universal way [\#664](
- Kokkos RNG normal\(\) function hangs for small seed value [\#655](
- Kokkos Tests Errors on Shepard/HSW Builds [\#644](
2017-02-14 01:50:34 +08:00
## [2.02.15]( (2017-02-10)
[Full Changelog](
**Implemented enhancements:**
- Containers: Adding block partitioning to StaticCrsGraph [\#625](
- Kokkos Make System can induce Errors on Cray Volta System [\#610](
- OpenMP: error out if KOKKOS\_HAVE\_OPENMP is defined but not \_OPENMP [\#605](
- CMake: fix standalone build with tests [\#604](
- Change README \(that GitHub shows when opening Kokkos project page\) to tell users how to submit PRs [\#597](
- Add correctness testing for all operators of Atomic View [\#420](
- Allow assignment of Views with compatible memory spaces [\#290](
- Build only one version of Kokkos library for tests [\#213](
- Clean out old KOKKOS\_HAVE\_CXX11 macros clauses [\#156](
- Harmonize Macro names [\#150](
**Fixed bugs:**
- Cray and PGI: Kokkos\_Parallel\_Reduce [\#634](
- Kokkos Make System can induce Errors on Cray Volta System [\#610](
- Normal\(\) function random number generator doesn't give the expected distribution [\#592](
2017-01-10 01:39:46 +08:00
## [2.02.07]( (2016-12-16)
[Full Changelog](
**Implemented enhancements:**
- Add CMake option to enable Cuda Lambda support [\#589](
- Add CMake option to enable Cuda RDC support [\#588](
- Add Initial Intel Sky Lake Xeon-HPC Compiler Support to Kokkos Make System [\#584](
- Building Tutorial Examples [\#582](
- Internal way for using ThreadVectorRange without TeamHandle [\#574](
- Testing: Add testing for uvm and rdc [\#571](
- Profiling: Add Memory Tracing and Region Markers [\#557](
- nvcc\_wrapper not installed with Kokkos built with CUDA through CMake [\#543](
- Improve DynRankView debug check [\#541](
- Benchmarks: Add Gather benchmark [\#536](
- Testing: add spot\_check option to test\_all\_sandia [\#535](
- Deprecate Kokkos::Impl::VerifyExecutionCanAccessMemorySpace [\#527](
- Add AtomicAdd support for 64bit float for Pascal [\#522](
- Add Restrict and Aligned memory trait [\#517](
- Kokkos Tests are Not Run using Compiler Optimization [\#501](
- Add support for clang 3.7 w/ openmp backend [\#393](
- Provide an error throw class [\#79](
**Fixed bugs:**
- Cuda UVM Allocation test broken with UVM as default space [\#586](
- Bug \(develop branch only\): multiple tests are now failing when forcing uvm usage. [\#570](
- Error in generate\ for Kokkos when Compiler is Empty String/Fails [\#568](
- XL 13.1.4 incorrect C++11 flag [\#553](
- Improve DynRankView debug check [\#541](
- Installing Library on MAC broken due to cp -u [\#539](
- Intel Nightly Testing with Debug enabled fails [\#534](
## [2.02.01]( (2016-11-01)
[Full Changelog](
**Implemented enhancements:**
- Add Changelog generation to our process. [\#506](
**Fixed bugs:**
- Test scratch\_request fails in Serial with Debug enabled [\#520](
- Bug In BoundsCheck for DynRankView [\#516](
## [2.02.00]( (2016-10-30)
[Full Changelog](
**Implemented enhancements:**
- Add PowerPC assembly for grabbing clock register in memory pool [\#511](
- Add GCC 6.x support [\#508](
- Test install and build against installed library [\#498](
- Makefile.kokkos adds expt-extended-lambda to cuda build with clang [\#490](
- Add top-level makefile option to just test kokkos-core unit-test [\#485](
- Split and harmonize Object Files of Core UnitTests to increase build parallelism [\#484](
- LayoutLeft to LayoutLeft subview for 3D and 4D views [\#473](
- Add official Cuda 8.0 support [\#468](
- Allow C++1Z Flag for Class Lambda capture [\#465](
- Add Clang 4.0+ compilation of Cuda code [\#455](
- Possible Issue with Intel 17.0.098 and GCC 6.1.0 in Develop Branch [\#445](
- Add name of view to "View bounds error" [\#432](
- Move Sort Binning Operators into Kokkos namespace [\#421](
- TaskPolicy - generate error when attempt to use uninitialized [\#396](
- Import WithoutInitializing and AllowPadding into Kokkos namespace [\#325](
- TeamThreadRange requires begin, end to be the same type [\#305](
- CudaUVMSpace should track \# allocations, due to CUDA limit on \# UVM allocations [\#300](
- Remove old View and its infrastructure [\#259](
**Fixed bugs:**
- Bug in TestCuda\_Other.cpp: most likely assembly inserted into Device code [\#515](
- Cuda Compute Capability check of GPU is outdated [\#509](
- multi\_scratch test with hwloc and pthreads seg-faults. [\#504](
- generate\_makefile.bash: "make install" is broken [\#503](
- make clean in Out of Source Build/Tests Does Not Work Correctly [\#502](
- Makefiles for test and examples have issues in Cuda when CXX is not explicitly specified [\#497](
- Dispatch lambda test directly inside GTEST macro doesn't work with nvcc [\#491](
- UnitTests with HWLOC enabled fail if run with mpirun bound to a single core [\#489](
- Failing Reducer Test on Mac with Pthreads [\#479](
- make test Dumps Error with Clang Not Found [\#471](
- OpenMP TeamPolicy member broadcast not using correct volatile shared variable [\#424](
- TaskPolicy - generate error when attempt to use uninitialized [\#396](
- New task policy implementation is pulling in old experimental code. [\#372](
- MemoryPool unit test hangs on Power8 with GCC 6.1.0 [\#298](
## [2.01.10]( (2016-09-27)
[Full Changelog](
**Implemented enhancements:**
- Enable Profiling by default in Tribits build [\#438](
- parallel\_reduce\(0\), parallel\_scan\(0\) unit tests [\#436](
- data\(\)==NULL after realloc with LayoutStride [\#351](
- Fix tutorials to track new Kokkos::View [\#323](
- Rename team policy set\_scratch\_size. [\#195](
**Fixed bugs:**
- Possible Issue with Intel 17.0.098 and GCC 6.1.0 in Develop Branch [\#445](
- Makefile spits syntax error [\#435](
- Kokkos::sort fails for view with all the same values [\#422](
- Generic Reducers: can't accept inline constructed reducer [\#404](
- data\(\)==NULL after realloc with LayoutStride [\#351](
- const subview of const view with compile time dimensions on Cuda backend [\#310](
- Kokkos \(in Trilinos\) Causes Internal Compiler Error on CUDA 8.0.21-EA on POWER8 [\#307](
- Core Oversubscription Detection Broken? [\#159](
## [2.01.06]( (2016-09-02)
[Full Changelog](
**Implemented enhancements:**
- Add "standard" reducers for lambda-supportable customized reduce [\#411](
- TaskPolicy - single thread back-end execution [\#390](
- Kokkos master clone tag [\#387](
- Query memory requirements from task policy [\#378](
- Output order of test\_atomic.cpp is confusing [\#373](
- Missing testing for atomics [\#341](
- Feature request for Kokkos to provide Kokkos::atomic\_fetch\_max and atomic\_fetch\_min [\#336](
- TaskPolicy\<Cuda\> performance requires teams mapped to warps [\#218](
**Fixed bugs:**
- Reduce with Teams broken for custom initialize [\#407](
- Failing Kokkos build on Debian [\#402](
- Failing Tests on NVIDIA Pascal GPUs [\#398](
- Algorithms: fill\_random assumes dimensions fit in unsigned int [\#389](
- Kokkos::subview with RandomAccess Memory Trait [\#385](
- Build warning \(signed / unsigned comparison\) in Cuda implementation [\#365](
- wrong results for a parallel\_reduce with CUDA8 / Maxwell50 [\#352](
- Hierarchical parallelism - 3 level unit test [\#344](
- Can I allocate a View w/ both WithoutInitializing & AllowPadding? [\#324](
- subview View layout determination [\#309](
- Unit tests with Cuda - Maxwell [\#196](
## [2.01.00]( (2016-07-21)
[Full Changelog](
**Implemented enhancements:**
- Edit ViewMapping so assigning Views with the same custom layout compiles when const casting [\#327](
- DynRankView: Performance improvement for operator\(\) [\#321](
- Interoperability between static and dynamic rank views [\#295](
- subview member function? [\#280](
- Interoperability between View and DynRankView. [\#245](
- \(Trilinos\) build warning in atomic\_assign, with Kokkos::complex [\#177](
- View\<\>::shmem\_size should runtime check for number of arguments equal to rank [\#176](
- Custom reduction join via lambda argument [\#99](
- DynRankView with 0 dimensions passed in at construction [\#293](
- Inject view\_alloc and friends into Kokkos namespace [\#292](
- Less restrictive TeamPolicy reduction on Cuda [\#286](
- deep\_copy using remap with source execution space [\#267](
- Suggestion: Enable opt-in L1 caching via nvcc-wrapper [\#261](
- More flexible create\_mirror functions [\#260](
- Rename View::memory\_span to View::required\_allocation\_size [\#256](
- Use of subviews and views with compile-time dimensions [\#237](
- Kokkos::Timer [\#234](
- Fence CudaUVMSpace allocations [\#230](
- View::operator\(\) accept std::is\_integral and std::is\_enum [\#227](
- Allocating zero size View [\#216](
- Thread scalable memory pool [\#212](
- Add a way to disable memory leak output [\#194](
- Kokkos exec space init should init Kokkos profiling [\#192](
- Runtime rank wrapper for View [\#189](
- Profiling Interface [\#158](
- Fix View assignment \(of managed to unmanaged\) [\#153](
- Add unit test for assignment of managed View to unmanaged View [\#152](
- Check for oversubscription of threads with MPI in Kokkos::initialize [\#149](
- Dynamic resizeable 1-dimensional view [\#143](
- Develop TaskPolicy for CUDA [\#142](
- New View : Test Compilation Downstream [\#138](
- New View Implementation [\#135](
- Add variant of subview that lets users add traits [\#134](
- NVCC-WRAPPER: Add --host-only flag [\#121](
- Address gtest issue with TriBITS Kokkos build outside of Trilinos [\#117](
- Make tests pass with -expt-extended-lambda on CUDA [\#108](
- Dynamic scheduling for parallel\_for and parallel\_reduce [\#106](
- Runtime or compile time error when reduce functor's join is not properly specified as const member function or with volatile arguments [\#105](
- Error out when the number of threads is modified after kokkos is initialized [\#104](
- Porting to POWER and remove assumption of X86 default [\#103](
- Dynamic scheduling option for RangePolicy [\#100](
- SharedMemory Support for Lambdas [\#81](
- Recommended TeamSize for Lambdas [\#80](
- Add Aggressive Vectorization Compilation mode [\#72](
- Dynamic scheduling team execution policy [\#53](
- UVM allocations in multi-GPU systems [\#50](
- Synchronic in Kokkos::Impl [\#44](
- index and dimension types in for loops [\#28](
- Subview assign of 1D Strided with stride 1 to LayoutLeft/Right [\#1](
**Fixed bugs:**
- misspelled variable name in Kokkos\_Atomic\_Fetch + missing unit tests [\#340](
- seg fault Kokkos::Impl::CudaInternal::print\_configuration [\#338](
- Clang compiler error with named parallel\_reduce, tags, and TeamPolicy. [\#335](
- Shared Memory Allocation Error at parallel\_reduce [\#311](
- DynRankView: Fix resize and realloc [\#303](
- Scratch memory and dynamic scheduling [\#279](
- MemoryPool infinite loop when out of memory [\#312](
- Kokkos DynRankView changes break Sacado and Panzer [\#299](
- MemoryPool fails to compile on non-cuda non-x86 [\#297](
- Random Number Generator Fix [\#296](
- View template parameter ordering Bug [\#282](
- Serial task policy broken. [\#281](
- deep\_copy with LayoutStride should not memcpy [\#262](
- DualView::need\_sync should be a const method [\#248](
- Arbitrary-sized atomics on GPUs broken; loop forever [\#238](
- boolean reduction value\_type changes answer [\#225](
- Custom init\(\) function for parallel\_reduce with array value\_type [\#210](
- unit\_test Makefile is Broken - Recursively Calls itself until Machine Apocalypse. [\#202](
- nvcc\_wrapper Does Not Support -Xcompiler \<compiler option\> [\#198](
- Kokkos exec space init should init Kokkos profiling [\#192](
- Kokkos Threads Backend impl\_shared\_alloc Broken on Intel 16.1 \(Shepard Haswell\) [\#186](
- pthread back end hangs if used uninitialized [\#182](
- parallel\_reduce of size 0, not calling init/join [\#175](
- Bug in Threads with OpenMP enabled [\#173](
- KokkosExp\_SharedAlloc, m\_team\_work\_index inaccessible [\#166](
- 128-bit CAS without Assembly Broken? [\#161](
- fatal error: Cuda/Kokkos\_Cuda\_abort.hpp: No such file or directory [\#157](
- Power8: Fix OpenMP backend [\#139](
- Data race in Kokkos OpenMP initialization [\#131](
- parallel\_launch\_local\_memory and cuda 7.5 [\#125](
- Resize can fail with Cuda due to asynchronous dispatch [\#119](
- Qthread taskpolicy initialization bug. [\#92](
- Windows: sys/mman.h [\#89](
- Windows: atomic\_fetch\_sub\(\) [\#88](
- Windows: snprintf [\#87](
- Parallel\_Reduce with TeamPolicy and league size of 0 returns garbage [\#85](
- Throw with Cuda when using \(2D\) team\_policy parallel\_reduce with less than a warp size [\#76](
- Scalar views don't work with Kokkos::Atomic memory trait [\#69](
- Reduce the number of threads per team for Cuda [\#63](
- Named Kernels fail for reductions with CUDA [\#60](
- Kokkos View dimension\_\(\) for long returning unsigned int [\#20](
- atomic test hangs with LLVM [\#6](
- OpenMP Test should set omp\_set\_num\_threads to 1 [\#4](
**Closed issues:**
- develop branch broken with CUDA 8 and --expt-extended-lambda [\#354](
- --arch=KNL with Intel 2016 build failure [\#349](
- Error building with Cuda when passing -DKOKKOS\_CUDA\_USE\_LAMBDA to generate\_makefile.bash [\#343](
- Can I safely use int indices in a 2-D View with capacity \> 2B? [\#318](
- Kokkos::ViewAllocateWithoutInitializing is not working [\#317](
- Intel build on Mac OS X [\#277](
- deleted [\#271](
- Broken Mira build [\#268](
- 32-bit build [\#246](
- parallel\_reduce with RDC crashes linker [\#232](
- build of Kokkos\_Sparse\_MV\_impl\_spmv\_Serial.cpp.o fails if you use nvcc and have cuda disabled [\#209](
- Kokkos Serial execution space is not tested with TeamPolicy. [\#207](
- Unit test failure on Hansen KokkosCore\_UnitTest\_Cuda\_MPI\_1 [\#200](
- nvcc compiler warning: calling a \_\_host\_\_ function from a \_\_host\_\_ \_\_device\_\_ function is not allowed [\#180](
- Intel 15 build error with defaulted "move" operators [\#171](
- missing libkokkos.a during Trilinos 12.4.2 build, yet other libkokkos\*.a libs are there [\#165](
- Tie atomic updates to execution space or even to thread team? \(speculation\) [\#144](
- New View: Compiletime/size Test [\#137](
- New View : Performance Test [\#136](
- Signed/unsigned comparison warning in CUDA parallel [\#130](
- Kokkos::complex: Need op\* w/ std::complex & real [\#126](
- Use uintptr\_t for casting pointers [\#110](
- Default thread mapping behavior between P and Q threads. [\#91](
- Windows: Atomic\_Fetch\_Exchange\(\) return type [\#90](
- Synchronic unit test is way too long [\#84](
- nvcc\_wrapper -\> $\(NVCC\_WRAPPER\) [\#42](
- Check compiler version and print helpful message [\#39](
- Kokkos shared memory on Cuda uses a lot of registers [\#31](
- Can not pass unit test `` without a GT 720 [\#25](
- Makefile.kokkos lacks bounds checking option that CMake has [\#24](
- Kokkos can not complete unit tests with CUDA UVM enabled [\#23](
- Simplify teams + shared memory histogram example to remove vectorization [\#21](
- Kokkos needs to refer to ${PROJECT\_NAME}\_ENABLE\_CXX11 not Trilinos\_ENABLE\_CXX11 [\#17](
- Kokkos Base Makefile adds AVX to KNC Build [\#16](
- MS Visual Studio 2013 Build Errors [\#9](
- subview\(X, ALL\(\), j\) for 2-D LayoutRight View X: should it view a column? [\#5](
## [End_C++98]( (2015-04-15)
