The goal of this patch is to maximize CPU utilization on multi-socket or high core count systems, so that parallel computations such as LLD/ThinLTO can use all hardware threads in the system. Before this patch, on Windows, a maximum of 64 hardware threads could be used at most, in some cases dispatched only on one CPU socket.
== Background ==
Windows doesn't have a flat cpu_set_t like Linux. Instead, it projects hardware CPUs (or NUMA nodes) to applications through a concept of "processor groups". A "processor" is the smallest unit of execution on a CPU, that is, an hyper-thread if SMT is active; a core otherwise. There's a limit of 32-bit processors on older 32-bit versions of Windows, which later was raised to 64-processors with 64-bit versions of Windows. This limit comes from the affinity mask, which historically is represented by the sizeof(void*). Consequently, the concept of "processor groups" was introduced for dealing with systems with more than 64 hyper-threads.
By default, the Windows OS assigns only one "processor group" to each starting application, in a round-robin manner. If the application wants to use more processors, it needs to programmatically enable it, by assigning threads to other "processor groups". This also means that affinity cannot cross "processor group" boundaries; one can only specify a "preferred" group on start-up, but the application is free to allocate more groups if it wants to.
This creates a peculiar situation, where newer CPUs like the AMD EPYC 7702P (64-cores, 128-hyperthreads) are projected by the OS as two (2) "processor groups". This means that by default, an application can only use half of the cores. This situation could only get worse in the years to come, as dies with more cores will appear on the market.
== The problem ==
The heavyweight_hardware_concurrency() API was introduced so that only *one hardware thread per core* was used. Once that API returns, that original intention is lost, only the number of threads is retained. Consider a situation, on Windows, where the system has 2 CPU sockets, 18 cores each, each core having 2 hyper-threads, for a total of 72 hyper-threads. Both heavyweight_hardware_concurrency() and hardware_concurrency() currently return 36, because on Windows they are simply wrappers over std:🧵:hardware_concurrency() -- which can only return processors from the current "processor group".
== The changes in this patch ==
To solve this situation, we capture (and retain) the initial intention until the point of usage, through a new ThreadPoolStrategy class. The number of threads to use is deferred as late as possible, until the moment where the std::threads are created (ThreadPool in the case of ThinLTO).
When using hardware_concurrency(), setting ThreadCount to 0 now means to use all the possible hardware CPU (SMT) threads. Providing a ThreadCount above to the maximum number of threads will have no effect, the maximum will be used instead.
The heavyweight_hardware_concurrency() is similar to hardware_concurrency(), except that only one thread per hardware *core* will be used.
When LLVM_ENABLE_THREADS is OFF, the threading APIs will always return 1, to ensure any caller loops will be exercised at least once.
Differential Revision: https://reviews.llvm.org/D71775
This change has two components. The moves the generated file
for a namespace to the directory named after the namespace in
a file named 'index.<format>'. This greatly improves the browsing
experience since the index page is shown by default for a directory.
The second improves the markdown output by adding the links to the
referenced pages for children objects and the link back to the source
code.
Patch By: Clayton
Differential Revision: https://reviews.llvm.org/D72954
This change has two components. The moves the generated file
for a namespace to the directory named after the namespace in
a file named 'index.<format>'. This greatly improves the browsing
experience since the index page is shown by default for a directory.
The second improves the markdown output by adding the links to the
referenced pages for children objects and the link back to the source
code.
Patch By: Clayton
Differential Revision: https://reviews.llvm.org/D72954
This change has two components. The moves the generated file
for a namespace to the directory named after the namespace in
a file named 'index.<format>'. This greatly improves the browsing
experience since the index page is shown by default for a directory.
The second improves the markdown output by adding the links to the
referenced pages for children objects and the link back to the source
code.
Patch By: Clayton
Differential Revision: https://reviews.llvm.org/D72954
This is how it should've been and brings it more in line with
std::string_view. There should be no functional change here.
This is mostly mechanical from a custom clang-tidy check, with a lot of
manual fixups. It uncovers a lot of minor inefficiencies.
This doesn't actually modify StringRef yet, I'll do that in a follow-up.
This reverts commit 3f76260dc0.
Breaks at least these tests on Windows:
Clang :: Driver/clang-offload-bundler.c
Clang :: Driver/clang-offload-wrapper.c
Use clang_target_link_libraries() in order to support linking against
libclang-cpp instead of static libraries.
Differential Revision: https://reviews.llvm.org/D68448
llvm-svn: 373786
"Bad block found.\n" -> "bad block found"
The lower cased form with no full stop or newline is more common in LLVM
tools.
Reviewed By: juliehockett
Differential Revision: https://reviews.llvm.org/D66783
llvm-svn: 370155
The new design includes a header (contains the project name), a main section, and a footer.
The main section is divided into three subsections. Left, middle, right. The left section contains the general index, the middle contains the info's data, and the right contains the index for the info's content.
The CSS has been updated.
A flag --project-name is added.
The Attributes attribute of the TagNode struct is now a vector of pairs because these attributes should be rendered in the insertion order.
The functions (cpp and js) that converts an Index tree structure into HTML were slightly modified; the first ul tag created is now a ol tag. The inner lists are still ul.
Differential Revision: https://reviews.llvm.org/D66353
llvm-svn: 369139
The value, if any, of --source-root flag was not being used.
This has been fixed and the logic was moved to the ClangDocContext
contructor.
Differential revision: https://reviews.llvm.org/D66268
llvm-svn: 369065
Two command line options have been added to clang-doc.
--repository=<string> - URL of repository that hosts code; used for links to definition locations.
--source-root=<string> - Directory where processed files are stored. Links to definition locations will only be generated if the file is in this dir.
If the file is in the source-root and a repository options is passed;
a link to the source code will be rendered by the HTML generator.
Differential Revision: https://reviews.llvm.org/D65483
llvm-svn: 368460
Idx in ClangDocContext instance was being modified by multiple threads
causing a seg fault.
A mutex is added to avoid this.
Differential Revision: https://reviews.llvm.org/D65915
llvm-svn: 368313
Reduce phase has been parallelized and a execution time was reduced by
60% with this.
The reading of bitcode (bitcode -> Info) was moved to this segment of
code parallelized so it now happens just before reducing.
Differential Revision: https://reviews.llvm.org/D65628
llvm-svn: 368206
An index structure is created while generating the output file for each
info. This structure is parsed to JSON and written to a file in the
output directory. The html for the index is not rendered by clang-doc. A
Javascript file is included in the output directory, this will the JSON
file and insert HTML elements into the file.
Differential Revision: https://reviews.llvm.org/D65690
llvm-svn: 368070
The tool used to stop execution if there was an error in the mapping
phase. It will now show the error but continue with the files that were
mapped correctly if the flag is true.
Differential revision: https://reviews.llvm.org/D65627
llvm-svn: 367729
An option has been added to clang-doc to provide a list of css stylesheets that the user wants to use for the generated html docs.
Depends on D64539.
Differential Revision: https://reviews.llvm.org/D64938
llvm-svn: 367072
<a> tags are added for the parents and members of records and return type and
params of functions. The link redirects to the reference's info file.
The directory path where each info file will be saved is now generated in the
serialization phase and stored as an attribute in each Info.
Bitcode writer and reader were modified to handle the new attributes.
Committed on behalf of Diego Astiazarán (diegoaat97@gmail.com).
Differential Revision: https://reviews.llvm.org/D63663
llvm-svn: 365937
Implements an HTML generator.
Nodes are used to represent each part of the HTML file. There are TagNodes that
represent every HTML tag (p, h1, div, ...) and they have children nodes, which
can be TagNodes or TextNodes (these nodes only have text).
Proper indentation is rendered within the files generated by tool.
No styling (CSS) is included.
Committed on behalf of Diego Astiazarán (diegoaat97@gmail.com)
Differential Revision: https://reviews.llvm.org/D63857
llvm-svn: 365687
Improves output for anonymous decls, and updates the '--public' flag to exclude everything under an anonymous namespace.
Differential Revision: https://reviews.llvm.org/D52847
llvm-svn: 364674
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
Now that the clang-doc libraries are covered by unit tests, we don't
need to have extensive (and unmaintainable) integration tests. This
replaces the integration test suite with a smaller one that just tests
the tool itself and removes extraneous dumping logic from the tool
itself.
Includes tests that cover the parse->serialize->merge->generate
pipeline, as well as tests for the --public, --format, --doxygen, and
--output flags.
Differential Revision: https://reviews.llvm.org/D53150
llvm-svn: 344655
Implementing a simple Markdown generator from the emitted bitcode
summary of declarations. Very primitive at this point, but will be
expanded. Currently emits an .md file for each class and namespace,
listing its contents.
For a more detailed overview of the tool, see the design document
on the mailing list:
http://lists.llvm.org/pipermail/cfe-dev/2017-December/056203.html
Differential Revision: https://reviews.llvm.org/D43424
llvm-svn: 339948
Relanding with a minor change to prevent an assertion on release bots.
The result of this adjusted mapper pass is that all Function and Enum
infos are absorbed into the info of their enclosing scope (i.e. the class
or namespace in which they are defined). Namespace and Record infos are
passed along to the final output, but the second pass creates a reference
to each in its parent scope. As a result, the top-level final outputs are
Namespaces and Records.
Differential Revision: https://reviews.llvm.org/D48341
llvm-svn: 338763
The result of this adjusted mapper pass is that all Function and Enum
infos are absorbed into the info of their enclosing scope (i.e. the
class or namespace in which they are defined). Namespace and Record
infos are passed along to the final output, but the second pass creates
a reference to each in its parent scope. As a result, the top-level final
outputs are Namespaces and Records.
llvm-svn: 338738
Submitted on behalf of Annie Cherkaev (@anniecherk)
Added a flag which, when enabled, documents only those methods and
fields which have a Public attribute.
Differential Revision: https://reviews.llvm.org/D48395
llvm-svn: 337602
Implmenting a YAML generator from the emitted bitcode summary of
declarations. Emits one YAML file for each declaration information.
For a more detailed overview of the tool, see the design document on the mailing list: http://lists.llvm.org/pipermail/cfe-dev/2017-December/056203.html
llvm-svn: 334103
Implements a simple, in-memory reducer for the mapped output of the
initial tool. This creates a collection object for storing the
deduplicated infos on each declaration, and populates that from the
mapper output. The collection object is serialized to LLVM
bitstream. On reading each serialized output, it checks to see if a
merge is necessary and if so, merges the new info with the existing
info (prefering the existing one if conflicts exist).
For a more detailed overview of the tool, see the design document
on the mailing list:
http://lists.llvm.org/pipermail/cfe-dev/2017-December/056203.html
Differential Revision: https://reviews.llvm.org/D43341
llvm-svn: 333932
Setting up the mapper part of the frontend framework for a clang-doc
tool. It creates a series of relevant matchers for declarations, and
uses the ToolExecutor to traverse the AST and extract the matching
declarations and comments. The mapper serializes the extracted
information to individual records for reducing and eventually doc
generation.
For a more detailed overview of the tool, see the design document on the
mailing list: http://lists.llvm.org/pipermail/cfe-dev/2017-December/056203.html
Differential Revision: https://reviews.llvm.org/D41102
llvm-svn: 327102