Try to reduce the number of global variables captured in OpenMP regions by capturing them only in the regions that mark them as not shared.
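A minimal sketch of the two cases this change distinguishes, as we read it. It is not taken from the patch, and the names `scale`, `sum_shared`, and `sum_private` are hypothetical. In the first function the global is shared by default, so the region can presumably refer to it directly. In the second, `firstprivate` marks it as not shared, which is the kind of region where a capture would still be expected.

```c
/* Hypothetical global variable, used only for this illustration. */
int scale = 3;

/* `scale` has no data-sharing clause, so it is shared by default.
   Assumption: this is the case where no capture is needed. */
int sum_shared(int n) {
    int total = 0;
    #pragma omp parallel for reduction(+:total)
    for (int i = 0; i < n; ++i)
        total += scale * i;
    return total;
}

/* `firstprivate(scale)` marks the global as not shared: each thread
   works on its own copy, initialized from the global's value. */
int sum_private(int n) {
    int total = 0;
    #pragma omp parallel for firstprivate(scale) reduction(+:total)
    for (int i = 0; i < n; ++i)
        total += scale * i;
    return total;
}
```

Both functions return the same value, because the loop only reads `scale`. The difference is in how the region gets access to the variable, not in the result.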
Added full support for the parallel master taskloop simd directive.
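For reference, a small sketch of how the combined directive might be used. The array names and the helper functions `fill_inputs` and `vector_add` are hypothetical and chosen only for this example. The sketch assumes a compiler with OpenMP 5.0 support. When OpenMP is not enabled, the pragma is ignored and the loop runs serially with the same result.

```c
#define N 64

/* Hypothetical input and output arrays for the illustration. */
int in_a[N], in_b[N], out[N];

/* Fill the inputs with known values. */
void fill_inputs(void) {
    for (int i = 0; i < N; ++i) {
        in_a[i] = i;
        in_b[i] = 2 * i;
    }
}

/* `parallel` creates a team of threads, `master` restricts the
   enclosed construct to the master thread, `taskloop` splits the
   loop iterations into tasks that the team can execute, and `simd`
   asks for each task's chunk of iterations to be vectorized. */
void vector_add(void) {
    #pragma omp parallel master taskloop simd
    for (int i = 0; i < N; ++i)
        out[i] = in_a[i] + in_b[i];
}
```

Calling `fill_inputs()` followed by `vector_add()` leaves `out[i]` equal to `3 * i` for every index.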