llvm-project

Commit Graph

Author	SHA1	Message	Date
Jonathan Peyton	6d88e049dc	[OpenMP] Implement OpenMP 5.0 affinity format functionality This patch adds the affinity format functionality introduced in OpenMP 5.0. This patch adds: Two new environment variables: OMP_DISPLAY_AFFINITY=TRUE|FALSE OMP_AFFINITY_FORMAT=<string> and four new API functions: 1) omp_set_affinity_format() 2) omp_get_affinity_format() 3) omp_display_affinity() 4) omp_capture_affinity() The affinity format functionality has two ICVs associated with it: affinity-display-var (bool) and affinity-format-var (string). The affinity-display-var enables/disables the functionality through the environment variable OMP_DISPLAY_AFFINITY. The affinity-format-var is a formatted string with the special field types beginning with a '%' character, similar to printf. For example, the affinity-format-var could be: "OMP: host:%H pid:%P OStid:%i num_threads:%N thread_num:%n affinity:{%A}" The affinity-format-var is displayed by every thread implicitly at the beginning of a parallel region when any thread's affinity has changed (including a brand new thread being spawned), or explicitly using the omp_display_affinity() API. The omp_capture_affinity() function can capture the affinity-format-var in a char buffer. And omp_set|get_affinity_format() allow the user to set|get the affinity-format-var explicitly at runtime. omp_capture_affinity() and omp_get_affinity_format() both return the number of characters needed to hold the entire string they tried to make (not including the NULL character). If not enough buffer space is available, both these functions truncate their output. Differential Revision: https://reviews.llvm.org/D55148 llvm-svn: 349089	2018-12-13 23:14:24 +00:00
Jonathan Peyton	e525f0d4e2	[OpenMP] Fix balanced affinity so thread's private affinity mask is updated Balanced affinity only updated the thread's affinity with the operating system. This change also has the thread's private mask reflect that change as well so that any API that probes the thread's affinity mask will report the correct mask value. Differential Revision: https://reviews.llvm.org/D52379 llvm-svn: 343142	2018-09-26 20:43:23 +00:00
Jonathan Peyton	2c3e5d82b4	[OpenMP] Fixed affinity verbose double printing for balanced type. llvm-svn: 340647	2018-08-24 20:35:42 +00:00
Jonathan Peyton	baad3f6016	[OpenMP] Cleanup code This patch cleans up unused functions, variables, sign compare issues, and addresses some -Warning flags which are now enabled including -Wcast-qual. Not all the warning flags in LibompHandleFlags.cmake are enabled, but some are with this patch. Some __kmp_gtid_from_* macros in kmp.h are switched to static inline functions which allows us to remove the awkward definition of KMP_DEBUG_ASSERT() and KMP_ASSERT() macros which used the comma operator. This had to be done for the innumerable -Wunused-value warnings related to KMP_DEBUG_ASSERT() Differential Revision: https://reviews.llvm.org/D49105 llvm-svn: 339393	2018-08-09 22:04:30 +00:00
Jonathan Peyton	f639936748	[OpenMP] Introduce hierarchical scheduling This patch introduces the logic implementing hierarchical scheduling. First and foremost, hierarchical scheduling is off by default. To enable, use -DLIBOMP_USE_HIER_SCHED=On during CMake's configure stage. This work is based on the IWOMP paper: "Workstealing and Nested Parallelism in SMP Systems" Hierarchical scheduling is the layering of OpenMP schedules for different layers of the memory hierarchy. One can have multiple layers between the threads and the global iteration space. The threads will go up the hierarchy to grab iterations, using possibly a different schedule & chunk for each layer. [ Global iteration space (0-999) ] (use static) [ L1 | L1 | L1 | L1 ] (use dynamic,1) [ T0 T1 | T2 T3 | T4 T5 | T6 T7 ] In the example shown above, there are 8 threads and 4 L1 caches being targeted. If the topology indicates that there are two threads per core, then two consecutive threads will share the data of one L1 cache unit. This example would have the iteration space (0-999) split statically across the four L1 caches (so the first L1 would get (0-249), the second would get (250-499), etc). Then the threads will use a dynamic,1 schedule to grab iterations from the L1 cache units. There are currently four supported layers: L1, L2, L3, NUMA. OMP_SCHEDULE can now read a hierarchical schedule with this syntax: OMP_SCHEDULE='EXPERIMENTAL LAYER,SCHED[,CHUNK][:LAYER,SCHED[,CHUNK]...]:SCHED,CHUNK' And OMP_SCHEDULE can still read the normal SCHED,CHUNK syntax from before. I've kept most of the hierarchical scheduling logic inside kmp_dispatch_hier.h to try to keep it separate from the rest of the code. Differential Revision: https://reviews.llvm.org/D47962 llvm-svn: 336571	2018-07-09 17:51:13 +00:00
Jonathan Peyton	1482db9e03	[OpenMP] Fix affinity API for KMP_AFFINITY=none|compact|scatter Currently, the affinity API reports garbage for the initial place list and any thread's place lists when using KMP_AFFINITY=none|compact|scatter. This patch does two things. For KMP_AFFINITY=none, it creates a one-entry table for the places, so that the initial place list is just a single place with all the proc ids in it. We also set the initial place of any thread to 0 instead of KMP_PLACE_ALL so that the thread reports that single place (place 0) instead of garbage (-1) when using the affinity API. When non-OMP_PROC_BIND affinity is used (including KMP_AFFINITY=compact|scatter), a thread's place list is populated correctly. We assume that each thread is assigned to a single place. This is implemented in two of the affinity API functions. Differential Revision: https://reviews.llvm.org/D45527 llvm-svn: 330283	2018-04-18 19:25:48 +00:00
Paul Osmialowski	7634f7093a	[AArch64] fix an issue with older /proc/cpuinfo layout There are two /proc/cpuinfo layouts in use for AArch64: old and new. The old one has all 'processor : n' lines in one section, hence checking for duplications does not make sense. Differential Revision: https://reviews.llvm.org/D41000 llvm-svn: 320593	2017-12-13 16:12:24 +00:00
Jonas Hahnfeld	ce528acf0d	Fix thread affinity on non-x86 Linux To make thread affinity work according to the OpenMP spec, the runtime needs information about the hardware topology. On Linux the default way is to parse /proc/cpuinfo which contains this information for x86 machines but (at least) not for AArch64 and Power architectures. Fortunately, there is a different code path which is able to get that data from sysfs. The needed patch has landed in 2006 for Linux 2.6.16 which is safe to assume nowadays (even RHEL 5 had a kernel version derived from 2.6.18, and we are now at RHEL 7!). Differential Revision: https://reviews.llvm.org/D40357 llvm-svn: 320151	2017-12-08 15:07:05 +00:00
Jonathan Peyton	125203e003	Eliminate double printing of verbose affinity settings Redundant extra verbose output of binding to full mask in case affinity=balanced or OMP_PLACES=<any> or OMP_PROC_BIND=<any> Differential Revision: https://reviews.llvm.org/D40624 llvm-svn: 319960	2017-12-06 21:07:41 +00:00
Andrey Churbanov	a5868215b4	Extension of HWLOC topology discovery with NUMA nodes and tiles Patch by Olga Malysheva Differential Revision: https://reviews.llvm.org/D40309 llvm-svn: 319422	2017-11-30 11:51:47 +00:00
Jonathan Peyton	64249504b5	Warning is emitted when tiles are requested but cannot be used Added two warnings: 1) Before building the topology map check if tiles are requested but the topo method is not hwloc; 2) After building the topology map check if tiles are requested but not detected by the library. Patch by Olga Malysheva Differential Revision: https://reviews.llvm.org/D40340 llvm-svn: 319374	2017-11-29 22:27:18 +00:00
Jonathan Peyton	94a114fc39	Apply formatting changes .clang-format's comments are removed and a (hopefully) final set of formatting changes are applied. Differential Revision: https://reviews.llvm.org/D38837 Differential Revision: https://reviews.llvm.org/D38920 llvm-svn: 316227	2017-10-20 19:30:57 +00:00
Jonathan Peyton	bd3a7633f1	Remove unnecessary semicolons Removes semicolons after if {} blocks, function definitions, etc. I was able to apply the large OMPT patch cleanly on top of this one with no conflicts. llvm-svn: 314340	2017-09-27 20:36:27 +00:00
Jonathan Peyton	6a393f75f4	Minor code cleanup of Klocwork issues Minor code cleanup of Klocwork issues. Fatal messages are given no return attribute. Define and use KMP_NORETURN to work for multiple C++ versions. Patch by Terry Wilmarth Differential Revision: https://reviews.llvm.org/D37275 llvm-svn: 312538	2017-09-05 15:43:58 +00:00
Andrey Churbanov	5ba90c7979	OpenMP RTL cleanup: eliminated warnings with -Wcast-qual, patch 2. Changes are: got all atomics to accept volatile pointers that allowed to simplify many type conversions. Windows specific code fixed correspondingly. Differential Revision: https://reviews.llvm.org/D35417 llvm-svn: 308164	2017-07-17 09:03:14 +00:00
Andrey Churbanov	c47afcd9bb	OpenMP RTL cleanup: eliminated warnings with -Wcast-qual. Changes are: replaced C-style casts with const_cast and reinterpret_cast; type of several counters changed to signed; type of parameters of 32-bit and 64-bit AND and OR intrinsics changed to unsigned; changed files formatted using clang-format version 3.8.1. Differential Revision: https://reviews.llvm.org/D34759 llvm-svn: 307020	2017-07-03 11:24:08 +00:00
Jonathan Peyton	642688b632	Fix minor formatting issues Some code was restructured to move it under KMP_DEBUG. The rest is formatting changes to fix some things broken by clang-format Patch by Terry Wilmarth Differential Revision: https://reviews.llvm.org/D33744 llvm-svn: 304438	2017-06-01 16:46:36 +00:00
Jonathan Peyton	e3e2aaf68d	Fix for KMP_AFFINITY=disabled and KMP_TOPOLOGY_METHOD=hwloc With these settings, the create_hwloc_map() method was being called causing an assert(). After some consideration, it was determined that disabling affinity explicitly should just disable hwloc as well. i.e., KMP_AFFINITY overrides KMP_TOPOLOGY_METHOD. This lets the user know that the Hwloc mechanism is being ignored when KMP_AFFINITY=disabled. Differential Revision: https://reviews.llvm.org/D33208 llvm-svn: 304344	2017-05-31 20:35:22 +00:00
Jonathan Peyton	586849918b	Fix for KMP_AFFINITY=respect with multiple processor groups An assert() was being tripped when KMP_AFFINITY=respect + Multiple Processor Groups. Let __kmp_affinity_create_proc_group_map() function be able to create address2os object which contains a single group by deleting restriction that process affinity mask must span multiple groups. llvm-svn: 303101	2017-05-15 19:05:59 +00:00
Jonathan Peyton	3041982dd1	Clang-format and whitespace cleanup of source code This patch contains the clang-format and cleanup of the entire code base. Some of clang-format's changes made the code look worse in places. A best effort was made to resolve the bulk of these problems, but many remain. Most of the problems were mangling line-breaks and tabbing of comments. Patch by Terry Wilmarth Differential Revision: https://reviews.llvm.org/D32659 llvm-svn: 302929	2017-05-12 18:01:32 +00:00
Jonathan Peyton	20e13d4a38	Fix Hwloc API Incompatibility Older Hwloc libraries (< 1.10.0) don't offer the HWLOC_OBJ_NUMANODE nor HWLOC_OBJ_PACKAGE types. Instead they are named HWLOC_OBJ_NODE and HWLOC_OBJ_SOCKET. This patch just defines the newer names based on the older names when using an older Hwloc. Differential Revision: https://reviews.llvm.org/D32496 llvm-svn: 301349	2017-04-25 19:04:07 +00:00
Andrey Churbanov	4a9a89241b	KMP_HW_SUBSET extended with NUMA support when HWLOC enabled Differential Revision: https://reviews.llvm.org/D31600 llvm-svn: 300220	2017-04-13 17:15:07 +00:00
Jonathan Peyton	16fd8fec76	Fix incorrect initial value of __kmp_affinity_type. Affinity initialization code expects __kmp_affinity_type has the value affinity_default by default, but the cleanup code does not properly set the value back to affinity_default. This may introduce some issues when multiple roots are trying to initialize/uninitialize the runtime successively. Patch by Hansang Bae Differential Revision: https://reviews.llvm.org/D31012 llvm-svn: 298313	2017-03-20 22:04:02 +00:00
Jonathan Peyton	3061e3e454	Printing OS thread id, when KMP_AFFINITY is set. Patch by Vishakha Agrawal Differential Revision: https://reviews.llvm.org/D28873 llvm-svn: 293315	2017-01-27 18:04:33 +00:00
Jonas Hahnfeld	c9a8a6c030	kmp_affinity: Fix check if specific bit is set Clang 4.0 trunk warns: warning: logical not is only applied to the left hand side of this bitwise operator [-Wlogical-not-parentheses] This points to a potential bug if the code really wants to check if the single bit is not set: If for example (buf.edx >> 9) = 2 (has any bit set except the least significant one), 'logical not' will return 0 which stays 0 after the 'bitwise and'. To do this correctly we first need to evaluate the 'bitwise and'. In that case it returns 2 & 1 = 0 which after the 'logical not' evaluates to 1. Differential Revision: https://reviews.llvm.org/D28599 llvm-svn: 291764	2017-01-12 11:39:04 +00:00
Jonathan Peyton	1cdd87adfd	Introduce dynamic affinity dispatch capabilities This set of changes enables the affinity interface (Either the preexisting native operating system or HWLOC) to be dynamically set at runtime initialization. The point of this change is that we were seeing performance degradations when using HWLOC. This allows the user to use the old affinity mechanisms which on large machines (>64 cores) makes a large difference in initialization time. These changes mostly move affinity code under a small class hierarchy: KMPAffinity class Mask {} KMPNativeAffinity : public KMPAffinity class Mask : public KMPAffinity::Mask KMPHwlocAffinity class Mask : public KMPAffinity::Mask Since all interface functions (for both affinity and the mask implementation) are virtual, the implementation can be chosen at runtime initialization. Differential Revision: https://reviews.llvm.org/D26356 llvm-svn: 286890	2016-11-14 21:08:35 +00:00
Jonathan Peyton	7c465a5f41	Fix bitmask upper bounds check Rather than checking KMP_CPU_SETSIZE, which doesn't exist when using Hwloc, we use the get_max_proc() function which can vary based on the operating system. For example on Windows with multiple processor groups, it might be the case that the highest bit possible in the bitmask is not equal to the number of hardware threads on the machine but something higher than that. Differential Revision: https://reviews.llvm.org/D24206 llvm-svn: 281245	2016-09-12 19:02:53 +00:00
Jonathan Peyton	e6abe52905	Move function into cpp file under KMP_AFFINITY_SUPPORTED guard. When affinity isn't supported, __kmp_affinity_compact doesn't exist. The problem is that in kmp_affinity.h there is a function which uses it without the proper KMP_AFFINITY_SUPPORTED guard around it. The compiler was smart enough to ignore it and the function __kmp_affinity_cmp_Address_child_num which relies on it, but I think it is cleaner to have it under the proper guard. Since the function is only used in the kmp_affinity.cpp file and there aren't any plans to have it elsewhere. I have moved it there. llvm-svn: 280542	2016-09-02 20:54:58 +00:00
Jonathan Peyton	788c5d65e8	Replace a bad instance of __kmp_free() with KMP_CPU_FREE_ARRAY() macro. llvm-svn: 280530	2016-09-02 19:37:12 +00:00
Andrey Churbanov	5bf494e73d	Fixed x2APIC discovery for 256-processor architectures. Mask for value read from ebx register returned by CPUID expanded to 0xFFFF. Differential Revision: https://reviews.llvm.org/D23203 llvm-svn: 277825	2016-08-05 15:59:11 +00:00
Paul Osmialowski	ecbe2ea002	Make balanced affinity work on AArch64. This patch enables balanced affinity on machines that do not have hardware threads and have cores clustered into packages. In fact, the balancing algorithm could be generalized for any arrangement with at least two levels of hierarchy (depth > 1). Differential Revision: https://reviews.llvm.org/D22365 llvm-svn: 277212	2016-07-29 20:55:03 +00:00
Andrey Churbanov	cb28d6e3a0	D22136: Memory leaks fixed by adding missed __kmp_free() calls llvm-svn: 274850	2016-07-08 14:40:20 +00:00
Jonathan Peyton	fd7cc42fed	Improvements to process affinity mask setting A couple improvements: 1) Add ability to limit fullMask size when KMP_HW_SUBSET limits resources. 2) Make KMP_HW_SUBSET work for affinity_none, and only limit fullMask in this case. Patch by Andrey Churbanov. Differential Revision: http://reviews.llvm.org/D21528 llvm-svn: 273278	2016-06-21 15:54:38 +00:00
Jonathan Peyton	bf35771bcc	Change hwloc discovery algorithm to print topology only for accessible resources Change hwloc discovery algorithm to print topology for only accessible resources, and report uniformity correspondingly, similar to what other topology discovery algorithms do. Fixes minor inconsistency in total topology reported and resources used for threads binding in case hwloc used. Patch by Andrey Churbanov. Differential Revision: http://reviews.llvm.org/D21389 llvm-svn: 272952	2016-06-16 20:31:19 +00:00
Jonathan Peyton	72a8498e08	Fixed missing memory cleanup in __kmp_affinity_create_hwloc_map() Cleanup: fixed missing memory cleanup in a couple of corner cases. Fixes possible memory leak in some corner cases. Patch by Andrey Churbanov Differential Revision: http://reviews.llvm.org/D21355 llvm-svn: 272946	2016-06-16 20:14:54 +00:00
Jonathan Peyton	b9d28fbeb3	Deprecate KMP_PLACE_THREADS and rename as KMP_HW_SUBSET Deprecate KMP_PLACE_THREADS and rename it to KMP_HW_SUBSET due to confusion about its purpose and function among users. KMP_HW_SUBSET is an environment variable which allows users to easily pick a subset of the hardware topology to use. e.g., KMP_HW_SUBSET=30c,2t means use 30 cores, 2 threads per core. Patch by Andrey Churbanov Differential Revision: http://reviews.llvm.org/D21340 llvm-svn: 272937	2016-06-16 18:53:48 +00:00
Jonathan Peyton	c5304aa3c4	Affinity mask processing improvements Remove static specifier from var fullMask and remove kmp_get_fullMask() routine. When iterating through procs in a mask, always check if proc is in fullMask (this check was missing in a few places). Patch by Brian Bliss. Differential Revision: http://reviews.llvm.org/D21300 llvm-svn: 272589	2016-06-13 21:28:03 +00:00
Jonathan Peyton	202a24dd9b	Hwloc refactoring patch These changes remove the hwloc_topology_ignore_type function which doesn't exist in the hwloc 2.0 API. In the existing code, the topology extracted from hwloc has the cache levels stripped out and then assumes the final stripped topology follows the typical three-level topology: packages -> cores -> HW threads. But the code is doing unclean manipulations to determine at what level those resources are located and also assumes too much about what hwloc is detecting (there could be intermediate levels in between socket and core for instance). This new way of extracting the topology doesn't strip out any hardware objects that hwloc detects. It does not assume the three level topology, and instead searches for the relevant three levels within the topology for each bit of information using hwloc interface functions. i.e., the three level topology subset that our affinity code is interested in is extracted from the hwloc topology tree directly. For example, the new __kmp_hwloc_get_nobjs_under_obj function gives the user the number of cores under a socket reliably without worrying if there are unexpected objects between the socket object and core object in the hwloc topology structure. Also, now that all topology information is kept, there are also possibilities of using the caches/numa nodes to determine more sophisticated affinity settings in the future. There is also some cleanup code added for the destruction of the __kmp_hwloc_topology object. Differential Revision: http://reviews.llvm.org/D21195 llvm-svn: 272565	2016-06-13 17:30:08 +00:00
Jonathan Peyton	8407f5b3bd	Remove architecture dependent Hwloc DEBUG section This debug section's functionality can be replicated using the environment variable KMP_TOPOLOGY_METHOD with different values and KMP_AFFINITY=verbose llvm-svn: 267472	2016-04-25 21:11:26 +00:00
Jonathan Peyton	1d5487c5d0	Fix buffer problem with printing long Hwloc affinity mask This change has the hwloc_bitmap_list_snprintf() function use the entire buffer to print the mask. There is no need to shorten the buffer length by 7. It only needs to be shortened by one byte. llvm-svn: 267470	2016-04-25 21:08:31 +00:00
Jonathan Peyton	3076fa4c35	New API for restoring current thread's affinity to init affinity of application This new API, int kmp_set_thread_affinity_mask_initial(), is available for use by other parallel runtime libraries inside a possibly OpenMP-registered thread. This entry point restores the current thread's affinity mask to the affinity mask of the application when it first began. If -1 is returned it can be assumed that either the thread hasn't called affinity initialization or that the thread isn't registered with the OpenMP library. If 0 is returned, then the call was successful. Any return value greater than zero indicates an error occurred when setting affinity. Differential Revision: http://reviews.llvm.org/D15867 llvm-svn: 257489	2016-01-12 17:21:55 +00:00
Jonathan Peyton	01dcf36bd5	Adding Hwloc library option for affinity mechanism These changes allow libhwloc to be used as the topology discovery/affinity mechanism for libomp. It is supported on Unices. The code additions: * Canonicalize KMP_CPU_* interface macros so bitmask operations are implementation independent and work with both hwloc bitmaps and libomp bitmaps. So there are new KMP_CPU_ALLOC_* and KMP_CPU_ITERATE() macros and the like. These are all in kmp.h and appropriately placed. * Hwloc topology discovery code in kmp_affinity.cpp. This uses the hwloc interface to create a libomp address2os object which the rest of libomp knows how to handle already. * To build, use -DLIBOMP_USE_HWLOC=on and -DLIBOMP_HWLOC_INSTALL_DIR=/path/to/install/dir [default /usr/local]. If CMake can't find the library or hwloc.h, then it will tell you and exit. Differential Revision: http://reviews.llvm.org/D13991 llvm-svn: 254320	2015-11-30 20:02:59 +00:00
Jonathan Peyton	7dee82e729	Improvements to machine_hierarchy code for re-sizing These changes include: 1) Machine hierarchy now uses the base_num_threads field to indicate the maximum number of threads the current hierarchy can handle without a resize. 2) In __kmp_get_hierarchy, we need to get depth after any potential resize is done. 3) Cleanup of hierarchy resize code to support 1 above. Differential Revision: http://reviews.llvm.org/D14455 llvm-svn: 252475	2015-11-09 16:24:53 +00:00
Jonathan Peyton	6778c73243	Fix OMP_PLACES negation operator parsing (!place) Just moved the *scan++ line up before the recursive call. Otherwise, infinite recursion occurs and leads to a segmentation fault. llvm-svn: 250729	2015-10-19 19:43:01 +00:00
Jonathan Peyton	dd4aa9b6b5	Added sockets to the syntax of KMP_PLACE_THREADS environment variable. Added (optional) sockets to the syntax of the KMP_PLACE_THREADS environment variable. Some limitations: * The number of sockets and then optional offset should be specified first (before other parameters). * The letter designation is mandatory for sockets and then for other parameters. * If number of cores is specified first, then the number of sockets is defaulted to all sockets on the machine; also, the old syntax is partially supported if sockets are skipped. * If number of threads per core is specified first, then the number of sockets and cores per socket are defaulted to all sockets and all cores per socket respectively. * The number of cores per socket cannot be specified before sockets or after threads per core. * The number of threads per core can be specified before or after core-offset (old syntax required it to be before core-offset); * Parameters delimiter can be: empty, comma, lower-case x; * Spaces are allowed around numbers, around letters, around delimiter. Approximate shorthand specification: KMP_PLACE_THREADS="[num_sockets(S|s)[[delim]offset(O|o)][delim]][num_cores_per_socket(C|c)[[delim]offset(O|o)][delim]][num_threads_per_core(T|t)]" Differential Revision: http://reviews.llvm.org/D13175 llvm-svn: 249708	2015-10-08 17:55:54 +00:00
Jonathan Peyton	7edeef1bbf	Fix memory corruption in Windows debug library This patch adjusts the buffer size when reducing the buffer used for printing. This solves the memory corruption in Windows debug library, and potential memory corruption in other builds. llvm-svn: 248588	2015-09-25 17:23:17 +00:00
Jonathan Peyton	df4d3dd659	Fix depth field bug and resize() function in hierarchical barrier This is a follow up to the hierarchy cleanup patch. Added some clarifying comments to hierarchy_info. Fixed a bug with the depth field not being updated cleanly during a resize. Fixed resize to first check capacity as determined by maxLevels before actually doing the full resize. Differential Revision: http://reviews.llvm.org/D12562 llvm-svn: 247333	2015-09-10 20:34:32 +00:00
Jonathan Peyton	1707836b68	Cleanup of affinity hierarchy code. Some of this is improvement to code suggested by Hal Finkel. Four changes here: 1. Cleanup of hierarchy code to handle all hierarchy cases whether affinity is available or not 2. Separated this and other classes and common functions out to a header file 3. Added a destructor-like fini function for the hierarchy (and call in __kmp_cleanup) 4. Remove some redundant code that is hopefully no longer needed Differential Revision: http://reviews.llvm.org/D12449 llvm-svn: 247326	2015-09-10 19:22:07 +00:00
Jonathan Peyton	62f3840c9b	Fix machine topology pruning. This patch fixes a bug when eliminating layers in the machine topology (namely cores, and threads). Before this patch, if a user specifies using only one thread per socket, then affinity is not set properly due to bad topology pruning. Differential Revision: http://reviews.llvm.org/D11158 llvm-svn: 245966	2015-08-25 18:44:41 +00:00
Jonathan Peyton	7f09a98ab1	Allow machine hierarchy expansion This fix allows the machine hierarchy to be expanded in case it needs to handle more threads. It adds a resize function to accomplish this. Differential Revision: http://reviews.llvm.org/D9900 llvm-svn: 240292	2015-06-22 15:59:18 +00:00
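
The precedence issue described in commit c9a8a6c030 can be reproduced in isolation. The sketch below is a stand-alone illustration, not the code from kmp_affinity.cpp: the parameter `edx` stands in for `buf.edx`, and the function names `bit9_clear_buggy` and `bit9_clear_fixed` are invented for this example. The buggy form is written with explicit parentheses so that it shows how the compiler parses `!(edx >> 9) & 1` without triggering the warning itself.

```cpp
#include <cstdint>

// Buggy check (hypothetical name): '!' binds tighter than '&', so the whole
// shifted value is negated first and only then masked with 1.
constexpr bool bit9_clear_buggy(std::uint32_t edx) {
  return ((!(edx >> 9)) & 1) != 0;
}

// Fixed check (hypothetical name): evaluate the bitwise AND first, then
// negate, which tests exactly one bit.
constexpr bool bit9_clear_fixed(std::uint32_t edx) {
  return !((edx >> 9) & 1);
}

// The case from the commit message: (edx >> 9) == 2, so bit 9 is clear but
// another, higher bit is set.
constexpr std::uint32_t kExample = 2u << 9;

static_assert(!bit9_clear_buggy(kExample),
              "buggy form wrongly reports that bit 9 is set");
static_assert(bit9_clear_fixed(kExample),
              "fixed form correctly reports that bit 9 is clear");
```

Both forms agree when `edx >> 9` is 0 or 1; they differ only when higher bits are set, which is why the bug could go unnoticed.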


73 Commits
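
The static layer in the hierarchical scheduling commit (f639936748) splits the iteration space 0-999 evenly across four L1 units. A minimal sketch of that arithmetic follows. The function name `split_static` is hypothetical and this is not the logic in kmp_dispatch_hier.h. How the runtime distributes a remainder is not stated in the commit message, so giving the leftover iterations to the first units is an assumption made here.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Split the iteration space [0, total) into `units` contiguous half-open
// blocks [begin, end). Hypothetical helper for illustration only.
// Assumption: any remainder goes one iteration at a time to the first blocks.
std::vector<std::pair<std::size_t, std::size_t>>
split_static(std::size_t total, std::size_t units) {
  std::vector<std::pair<std::size_t, std::size_t>> blocks;
  if (units == 0)
    return blocks; // nothing to split across
  const std::size_t base = total / units;  // minimum block length
  const std::size_t extra = total % units; // blocks that get one more
  std::size_t begin = 0;
  for (std::size_t u = 0; u < units; ++u) {
    const std::size_t len = base + (u < extra ? 1 : 0);
    blocks.emplace_back(begin, begin + len);
    begin += len;
  }
  return blocks;
}
```

Calling `split_static(1000, 4)` yields the blocks [0, 250), [250, 500), [500, 750) and [750, 1000), which correspond to the inclusive ranges 0-249, 250-499 and so on given in the commit message. Threads sharing an L1 unit would then take iterations from that unit's block with the dynamic,1 schedule.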