llvm-project

Commit Graph

Author	SHA1	Message	Date
Mehdi Amini	b1aaed023e	Enable `Pass::initialize()` to fail by returning a LogicalResult Differential Revision: https://reviews.llvm.org/D96474	2021-02-11 01:51:53 +00:00
Alex Zinenko	2996a8d675	[mlir] avoid exposing mutable DialectRegistry from MLIRContext MLIRContext allows its users to access directly to the DialectRegistry it contains. While sometimes useful for registering additional dialects on an already existing context, this breaks the encapsulation by essentially giving raw accesses to a part of the context's internal state. Remove this mutable access and instead provide a method to append a given DialectRegistry to the one already contained in the context. Also provide a shortcut mechanism to construct a context from an already existing registry, which seems to be a common use case in the wild. Keep read-only access to the registry contained in the context in case it needs to be copied or used for constructing another context. With this change, DialectRegistry is no longer concerned with loading the dialects and deciding whether to invoke delayed interface registration. Loading is concentrated in the MLIRContext, and the functionality of the registry better reflects its name. Depends On D96137 Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D96331	2021-02-10 12:07:34 +01:00
River Riddle	fe7c0d90b2	[mlir][IR] Remove the concept of `OperationProperties` These properties were useful for a few things before traits had a better integration story, but don't really carry their weight well these days. Most of these properties are already checked via traits in most of the code. It is better to align the system around traits, and improve the performance/cost of traits in general. Differential Revision: https://reviews.llvm.org/D96088	2021-02-09 12:00:15 -08:00
River Riddle	e21adfa32d	[mlir] Mark LogicalResult as LLVM_NODISCARD This makes ignoring a result explicit by the user, and helps to prevent accidental errors with dropped results. Marking LogicalResult as no discard was always the intention from the beginning, but got lost along the way. Differential Revision: https://reviews.llvm.org/D95841	2021-02-04 15:10:10 -08:00
River Riddle	02bc4c95f0	[mlir][PassManager] Only reinitialize the pass manager if the context registry changes This prevents needless reinitialization for clients that want to reuse a pass manager multiple times. A new `getRegisryHash` function is exposed by the context to give a rough indicator of when the context registry has changed. Differential Revision: https://reviews.llvm.org/D95493	2021-01-27 17:41:51 -08:00
Jacques Pienaar	aee622fa20	[mlir] Enable passing crash reproducer stream factory method Add factory to create streams for logging the reproducer. Allows for more general logging (beyond file) and logging the configuration/module separately (logged in order, configuration before module). Also enable querying filename of ToolOutputFile. Differential Revision: https://reviews.llvm.org/D94868	2021-01-21 20:03:15 -08:00
River Riddle	77501bd175	[mlir][PassManager] Properly set the initialization generation when cloning a pass manager Fixes a bug where dynamic pass pipelines of cloned pass managers weren't being initialized properly.	2021-01-08 14:41:29 -08:00
River Riddle	1ba5ea67a3	[mlir] Add a hook for initializing passes before execution and use it in the Canonicalizer This revision adds a new `initialize(MLIRContext *)` hook to passes that allows for them to initialize any heavy state before the first execution of the pass. A concrete use case of this is with patterns that rely on PDL, given that PDL is compiled at run time it is imperative that compilation results are cached as much as possible. The first use of this hook is in the Canonicalizer, which has the added benefit of reducing the number of expensive accesses to the context when collecting patterns. Differential Revision: https://reviews.llvm.org/D93147	2021-01-08 13:36:12 -08:00
River Riddle	d7eba20052	[mlir][Inliner] Refactor the inliner to use nested pass pipelines instead of just canonicalization Now that passes have support for running nested pipelines, the inliner can now allow for users to provide proper nested pipelines to use for optimization during inlining. This revision also changes the behavior of optimization during inlining to optimize before attempting to inline, which should lead to a more accurate cost model and prevents the need for users to schedule additional duplicate cleanup passes before/after the inliner that would already be run during inlining. Differential Revision: https://reviews.llvm.org/D91211	2020-12-14 18:09:47 -08:00
Jacques Pienaar	9c3fa3d84d	Don't emit on op diagnostic in reproducer emission This avoids dumping the module post emitting a reproducer, which results in many MB logs where a reproducer has already been neatly generated. Differential Revision: https://reviews.llvm.org/D93165	2020-12-13 07:21:32 -08:00
River Riddle	00c6ef8628	[mlir][Pass] Remove the restriction that PassManager can only run on ModuleOp This was a somewhat important restriction in the past when ModuleOp was distinctly the top-level container operation, as well as before the pass manager had support for running nested pass managers natively. With these two issues fading away, there isn't really a good reason to enforce that a ModuleOp is the thing running within a pass manager. As such, this revision removes the restriction and allows for users to pass in the name of the operation that the pass manager will be scheduled on. The only remaining dependency on BuiltinOps from Pass after this revision is due to FunctionPass, which will be resolved in a followup revision. Differential Revision: https://reviews.llvm.org/D92450	2020-12-03 15:47:01 -08:00
River Riddle	65fcddff24	[mlir][BuiltinDialect] Resolve comments from D91571 * Move ops to a BuiltinOps.h * Add file comments	2020-11-19 11:12:49 -08:00
River Riddle	bd106d7469	[mlir][Pass] Only enable/disable CrashRecovery once This prevents potential problems that occur when multiple pass managers register crash recovery contexts.	2020-11-18 18:50:18 -08:00
River Riddle	73ca690df8	[mlir][NFC] Remove references to Module.h and Function.h These includes have been deprecated in favor of BuiltinDialect.h, which contains the definitions of ModuleOp and FuncOp. Differential Revision: https://reviews.llvm.org/D91572	2020-11-17 00:55:47 -08:00
River Riddle	811001380f	[mlir][Pass] Remove the verifierPass now that verification is run during normal pass execution A recent refactoring removed the need to interleave verifier passes and instead opted to verify during the normal execution of passes instead. As such, the old verify pass is no longer necessary and can be removed. Differential Revision: https://reviews.llvm.org/D91212	2020-11-12 23:45:27 -08:00
Mehdi Amini	a62d38a90d	Disable implicit nesting on parsing textual pass pipeline Previous the textual form of the pass pipeline would implicitly nest, instead we opt for the explicit form here: this has less surprise. This also avoids asserting in the bindings when passing a pass pipeline with incorrect nesting. Differential Revision: https://reviews.llvm.org/D91233	2020-11-11 19:21:51 +00:00
Mehdi Amini	4455f3ce72	Capture the name for mlir::OpPassManager in std::string instead of StringRef (NFC) The previous behavior was fragile when building an OpPassManager using a string, as it was forcing the client to ensure the string to outlive the entire PassManager. This isn't a performance sensitive area either that would justify optimizing further.	2020-11-05 05:28:44 +00:00
Mehdi Amini	008b9d97cb	Make the implicit nesting behavior of the PassManager user-controllable and default to false This is an error prone behavior, I frequently have ~20 min debugging sessions when I hit an unexpected implicit nesting. This default makes the C++ API safer for users. Depends On D90669 Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D90671	2020-11-03 11:17:44 +00:00
Mehdi Amini	cd7107a62b	Handle the verifier at run() time in the PassManager instead of build time This simplifies a few parts of the pass manager, but in particular we don't add as many verifierpass as there are passes in the pipeline, and we can now enable/disable the verifier after the fact on an already built PassManager. Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D90669	2020-11-03 11:17:14 +00:00
Mehdi Amini	fe3c1195cf	Add a dump() method on the pass manager for debugging purpose (NFC) Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D88008	2020-09-23 05:53:41 +00:00
Mehdi Amini	fb1de7ed92	Implement a new kind of Pass: dynamic pass pipeline Instead of performing a transformation, such pass yields a new pass pipeline to run on the currently visited operation. This feature can be used for example to implement a sub-pipeline that would run only on an operation with specific attributes. Another example would be to compute a cost model and dynamic schedule a pipeline based on the result of this analysis. Discussion: https://llvm.discourse.group/t/rfc-dynamic-pass-pipeline/1637 Recommit after fixing an ASAN issue: the callback lambda needs to be allocated to a temporary to have its lifetime extended to the end of the current block instead of just the current call expression. Reviewed By: silvas Differential Revision: https://reviews.llvm.org/D86392	2020-09-22 18:51:54 +00:00
Thomas Joerg	0356a413a4	Revert "Implement a new kind of Pass: dynamic pass pipeline" This reverts commit `385c3f43fc`. Test mlir/test/Pass:dynamic-pipeline-fail-on-parent.mlir.test fails when run with ASAN: ERROR: AddressSanitizer: stack-use-after-scope on address ... Reviewed By: bkramer, pifon2a Differential Revision: https://reviews.llvm.org/D88079	2020-09-22 12:00:30 +02:00
Mehdi Amini	385c3f43fc	Implement a new kind of Pass: dynamic pass pipeline Instead of performing a transformation, such pass yields a new pass pipeline to run on the currently visited operation. This feature can be used for example to implement a sub-pipeline that would run only on an operation with specific attributes. Another example would be to compute a cost model and dynamic schedule a pipeline based on the result of this analysis. Discussion: https://llvm.discourse.group/t/rfc-dynamic-pass-pipeline/1637 Reviewed By: silvas Differential Revision: https://reviews.llvm.org/D86392	2020-09-22 01:24:25 +00:00
Mehdi Amini	702f06ad14	Fix crash in the pass pipeline when local reproducer is enabled This crash only happens when a function pass is followed by a module pass. In this case the splitting of the pass pipeline didn't handle properly the verifier passes and ended up with an odd number of pass in the pipeline, breaking an assumption of the local crash reproducer executor and hitting an assertion. Differential Revision: https://reviews.llvm.org/D88000	2020-09-21 08:52:50 +00:00
Mehdi Amini	c0b6bc070e	Decouple OpPassManager from the the MLIRContext (NFC) This is allowing to build an OpPassManager from a StringRef instead of an Identifier, which enables building pipelines without an MLIRContext. An identifier is still cached on-demand on the OpPassManager for efficiency during the IR traversal.	2020-09-03 06:02:05 +00:00
Mehdi Amini	1284dc34ab	Use an Identifier instead of an OperationName internally for OpPassManager identification (NFC) This allows to defers the check for traits to the execution instead of forcing it on the pipeline creation. In particular, this is making our pipeline creation tolerant to dialects not being loaded in the context yet. Reviewed By: rriddle, GMNGeoffrey Differential Revision: https://reviews.llvm.org/D86915	2020-09-02 21:46:05 +00:00
Mehdi Amini	c39c21610d	Rename AnalysisManager::slice in AnalysisManager::nest (NFC) The naming wasn't reflecting the intent of this API, "nest" is aligning it with the pass manager API.	2020-08-28 20:41:07 +00:00
Mehdi Amini	6c05ca21b9	Remove the `run` method from `OpPassManager` and `Pass` and migrate it to `OpToOpPassAdaptor` This makes OpPassManager more of a "container" of passes and not responsible to drive the execution. As such we also make it constructible publicly, which will allow to build arbitrary pipeline decoupled from the execution. We'll make use of this facility to expose "dynamic pipeline" in the future. Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D86391	2020-08-27 04:57:29 +00:00
Mehdi Amini	610706906a	Add an assertion to protect against missing Dialect registration in a pass pipeline (NFC) Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D86327	2020-08-24 06:49:29 +00:00
Mehdi Amini	f9dc2b7079	Separate the Registration from Loading dialects in the Context This changes the behavior of constructing MLIRContext to no longer load globally registered dialects on construction. Instead Dialects are only loaded explicitly on demand: - the Parser is lazily loading Dialects in the context as it encounters them during parsing. This is the only purpose for registering dialects and not load them in the context. - Passes are expected to declare the dialects they will create entity from (Operations, Attributes, or Types), and the PassManager is loading Dialects into the Context when starting a pipeline. This changes simplifies the configuration of the registration: a compiler only need to load the dialect for the IR it will emit, and the optimizer is self-contained and load the required Dialects. For example in the Toy tutorial, the compiler only needs to load the Toy dialect in the Context, all the others (linalg, affine, std, LLVM, ...) are automatically loaded depending on the optimization pipeline enabled. To adjust to this change, stop using the existing dialect registration: the global registry will be removed soon. 1) For passes, you need to override the method: virtual void getDependentDialects(DialectRegistry &registry) const {} and registery on the provided registry any dialect that this pass can produce. Passes defined in TableGen can provide this list in the dependentDialects list field. 2) For dialects, on construction you can register dependent dialects using the provided MLIRContext: `context.getOrLoadDialect<DialectName>()` This is useful if a dialect may canonicalize or have interfaces involving another dialect. 3) For loading IR, dialect that can be in the input file must be explicitly registered with the context. `MlirOptMain()` is taking an explicit registry for this purpose. See how the standalone-opt.cpp example is setup: mlir::DialectRegistry registry; registry.insert<mlir::standalone::StandaloneDialect>(); registry.insert<mlir::StandardOpsDialect>(); Only operations from these two dialects can be in the input file. To include all of the dialects in MLIR Core, you can populate the registry this way: mlir::registerAllDialects(registry); 4) For `mlir-translate` callback, as well as frontend, Dialects can be loaded in the context before emitting the IR: context.getOrLoadDialect<ToyDialect>() Differential Revision: https://reviews.llvm.org/D85622	2020-08-19 01:19:03 +00:00
Mehdi Amini	e75bc5c791	Revert "Separate the Registration from Loading dialects in the Context" This reverts commit `d14cf45735`. The build is broken with GCC-5.	2020-08-19 01:19:03 +00:00
Mehdi Amini	d14cf45735	Separate the Registration from Loading dialects in the Context This changes the behavior of constructing MLIRContext to no longer load globally registered dialects on construction. Instead Dialects are only loaded explicitly on demand: - the Parser is lazily loading Dialects in the context as it encounters them during parsing. This is the only purpose for registering dialects and not load them in the context. - Passes are expected to declare the dialects they will create entity from (Operations, Attributes, or Types), and the PassManager is loading Dialects into the Context when starting a pipeline. This changes simplifies the configuration of the registration: a compiler only need to load the dialect for the IR it will emit, and the optimizer is self-contained and load the required Dialects. For example in the Toy tutorial, the compiler only needs to load the Toy dialect in the Context, all the others (linalg, affine, std, LLVM, ...) are automatically loaded depending on the optimization pipeline enabled. To adjust to this change, stop using the existing dialect registration: the global registry will be removed soon. 1) For passes, you need to override the method: virtual void getDependentDialects(DialectRegistry &registry) const {} and registery on the provided registry any dialect that this pass can produce. Passes defined in TableGen can provide this list in the dependentDialects list field. 2) For dialects, on construction you can register dependent dialects using the provided MLIRContext: `context.getOrLoadDialect<DialectName>()` This is useful if a dialect may canonicalize or have interfaces involving another dialect. 3) For loading IR, dialect that can be in the input file must be explicitly registered with the context. `MlirOptMain()` is taking an explicit registry for this purpose. See how the standalone-opt.cpp example is setup: mlir::DialectRegistry registry; registry.insert<mlir::standalone::StandaloneDialect>(); registry.insert<mlir::StandardOpsDialect>(); Only operations from these two dialects can be in the input file. To include all of the dialects in MLIR Core, you can populate the registry this way: mlir::registerAllDialects(registry); 4) For `mlir-translate` callback, as well as frontend, Dialects can be loaded in the context before emitting the IR: context.getOrLoadDialect<ToyDialect>() Differential Revision: https://reviews.llvm.org/D85622	2020-08-18 23:23:56 +00:00
Mehdi Amini	d84fe55e0d	Revert "Separate the Registration from Loading dialects in the Context" This reverts commit `e1de2b7550`. Broke a build bot.	2020-08-18 22:16:34 +00:00
Mehdi Amini	e1de2b7550	Separate the Registration from Loading dialects in the Context This changes the behavior of constructing MLIRContext to no longer load globally registered dialects on construction. Instead Dialects are only loaded explicitly on demand: - the Parser is lazily loading Dialects in the context as it encounters them during parsing. This is the only purpose for registering dialects and not load them in the context. - Passes are expected to declare the dialects they will create entity from (Operations, Attributes, or Types), and the PassManager is loading Dialects into the Context when starting a pipeline. This changes simplifies the configuration of the registration: a compiler only need to load the dialect for the IR it will emit, and the optimizer is self-contained and load the required Dialects. For example in the Toy tutorial, the compiler only needs to load the Toy dialect in the Context, all the others (linalg, affine, std, LLVM, ...) are automatically loaded depending on the optimization pipeline enabled. To adjust to this change, stop using the existing dialect registration: the global registry will be removed soon. 1) For passes, you need to override the method: virtual void getDependentDialects(DialectRegistry &registry) const {} and registery on the provided registry any dialect that this pass can produce. Passes defined in TableGen can provide this list in the dependentDialects list field. 2) For dialects, on construction you can register dependent dialects using the provided MLIRContext: `context.getOrLoadDialect<DialectName>()` This is useful if a dialect may canonicalize or have interfaces involving another dialect. 3) For loading IR, dialect that can be in the input file must be explicitly registered with the context. `MlirOptMain()` is taking an explicit registry for this purpose. See how the standalone-opt.cpp example is setup: mlir::DialectRegistry registry; mlir::registerDialect<mlir::standalone::StandaloneDialect>(); mlir::registerDialect<mlir::StandardOpsDialect>(); Only operations from these two dialects can be in the input file. To include all of the dialects in MLIR Core, you can populate the registry this way: mlir::registerAllDialects(registry); 4) For `mlir-translate` callback, as well as frontend, Dialects can be loaded in the context before emitting the IR: context.getOrLoadDialect<ToyDialect>()	2020-08-18 21:14:39 +00:00
Mehdi Amini	25ee851746	Revert "Separate the Registration from Loading dialects in the Context" This reverts commit `2056393387`. Build is broken on a few bots	2020-08-15 09:21:47 +00:00
Mehdi Amini	2056393387	Separate the Registration from Loading dialects in the Context This changes the behavior of constructing MLIRContext to no longer load globally registered dialects on construction. Instead Dialects are only loaded explicitly on demand: - the Parser is lazily loading Dialects in the context as it encounters them during parsing. This is the only purpose for registering dialects and not load them in the context. - Passes are expected to declare the dialects they will create entity from (Operations, Attributes, or Types), and the PassManager is loading Dialects into the Context when starting a pipeline. This changes simplifies the configuration of the registration: a compiler only need to load the dialect for the IR it will emit, and the optimizer is self-contained and load the required Dialects. For example in the Toy tutorial, the compiler only needs to load the Toy dialect in the Context, all the others (linalg, affine, std, LLVM, ...) are automatically loaded depending on the optimization pipeline enabled. Differential Revision: https://reviews.llvm.org/D85622	2020-08-15 08:07:31 +00:00
Mehdi Amini	ba92dadf05	Revert "Separate the Registration from Loading dialects in the Context" This was landed by accident, will reland with the right comments addressed from the reviews. Also revert dependent build fixes.	2020-08-15 07:35:10 +00:00
Mehdi Amini	ebf521e784	Separate the Registration from Loading dialects in the Context This changes the behavior of constructing MLIRContext to no longer load globally registered dialects on construction. Instead Dialects are only loaded explicitly on demand: - the Parser is lazily loading Dialects in the context as it encounters them during parsing. This is the only purpose for registering dialects and not load them in the context. - Passes are expected to declare the dialects they will create entity from (Operations, Attributes, or Types), and the PassManager is loading Dialects into the Context when starting a pipeline. This changes simplifies the configuration of the registration: a compiler only need to load the dialect for the IR it will emit, and the optimizer is self-contained and load the required Dialects. For example in the Toy tutorial, the compiler only needs to load the Toy dialect in the Context, all the others (linalg, affine, std, LLVM, ...) are automatically loaded depending on the optimization pipeline enabled.	2020-08-14 09:40:27 +00:00
Reid Kleckner	932f0276ea	[Support] Move LLD's parallel algorithm wrappers to support Essentially takes the lld/Common/Threads.h wrappers and moves them to the llvm/Support/Paralle.h algorithm header. The changes are: - Remove policy parameter, since all clients use `par`. - Rename the methods to `parallelSort` etc to match LLVM style, since they are no longer C++17 pstl compatible. - Move algorithms from llvm::parallel:: to llvm::, since they have "parallel" in the name and are no longer overloads of the regular algorithms. - Add range overloads - Use the sequential algorithm directly when 1 thread is requested (skips task grouping) - Fix the index type of parallelForEachN to size_t. Nobody in LLVM was using any other parameter, and it made overload resolution hard for for_each_n(par, 0, foo.size(), ...) because 0 is int, not size_t. Remove Threads.h and update LLD for that. This is a prerequisite for parallel public symbol processing in the PDB library, which is in LLVM. Reviewed By: MaskRay, aganea Differential Revision: https://reviews.llvm.org/D79390	2020-05-05 15:21:05 -07:00
River Riddle	cb9ae0025c	[mlir] Add a new context flag for disabling/enabling multi-threading This is useful for several reasons: * In some situations the user can guarantee that thread-safety isn't necessary and don't want to pay the cost of synchronization, e.g., when parsing a very large module. * For things like logging threading is not desirable as the output is not guaranteed to be in stable order. This flag also subsumes the pass manager flag for multi-threading. Differential Revision: https://reviews.llvm.org/D79266	2020-05-02 12:32:25 -07:00
Stephen Neuendorffer	57818885be	[MLIR] Move Verifier and Dominance Analysis from /Analysis to /IR These libraries are distinct from other things in Analysis in that they operate only on core IR concepts. This also simplifies dependencies so that Dialect -> Analysis -> Parser -> IR. Previously, the parser depended on portions of the the Analysis directory as well, which sometimes caused issues with the way the cmake makefile generator discovers dependencies on generated files during compilation. Differential Revision: https://reviews.llvm.org/D79240	2020-05-01 20:01:46 -07:00
River Riddle	e62ff42f79	[mlir][Pass] Register a signal handler when generating crash reproducers. The current implementation uses CrashRecoveryContext, but this only supports recovering in a certain number of cases. This revision adds a signal handler to support even more situations. This revision was able to properly generate a reproducer for a segfault in the Inliner, that the current recovery couldn't. Differential Revision: https://reviews.llvm.org/D78315	2020-04-29 15:23:10 -07:00
River Riddle	983382f134	[mlir][Pass] Add support for generating local crash reproducers This revision adds a mode to the crash reproducer generator to attempt to generate a more local reproducer. This will attempt to generate a reproducer right before the offending pass that fails. This is useful for the majority of failures that are specific to a single pass, and situations where some passes in the pipeline are not registered with a specific tool. Differential Revision: https://reviews.llvm.org/D78314	2020-04-29 15:23:10 -07:00
River Riddle	56a698510f	[mlir][Pass][NFC] Merge OpToOpPassAdaptor and OpToOpPassAdaptorParallel This moves the threading check to runOnOperation. This produces a much cleaner interface for the adaptor pass, and will allow for the ability to enable/disable threading in a much cleaner way in the future. Differential Revision: https://reviews.llvm.org/D78313	2020-04-29 15:23:10 -07:00
River Riddle	2f21a57966	[llvm][STLExtras] Move the algorithm `interleave*` methods from MLIR to LLVM These have proved incredibly useful for interleaving values between a range w.r.t to streams. After this revision, the mlir/Support/STLExtras.h is empty. A followup revision will remove it from the tree. Differential Revision: https://reviews.llvm.org/D78067	2020-04-14 15:14:40 -07:00
River Riddle	a517191a47	[mlir][NFC] Refactor ClassID into a TypeID class. Summary: ClassID is a bit janky right now as it involves passing a magic pointer around. This revision hides the internal implementation mechanism within a new class TypeID. This class is a value-typed wrapper around the original ClassID implementation. Differential Revision: https://reviews.llvm.org/D77768	2020-04-10 23:52:33 -07:00
River Riddle	7824768b2e	[mlir][Pass] Add a new `Pass::getArgument` hook Summary: This hook allows for passes to specify the command line argument without the need for registration. More concretely this will allow for generating pass crash reproducers without needing to have the passes registered. This should remove the need for production tools to register passes, leaving that solely to development tools like mlir-opt. Differential Revision: https://reviews.llvm.org/D77907	2020-04-10 22:50:14 -07:00
Mehdi Amini	07aa9ae23b	Ensure that multi-threading is disabled when enabling IRPrinting with module scope This is avoid the user to shoot themselves in the foot and encounter strange crashes that are confusing until one run with TSAN. Differential Revision: https://reviews.llvm.org/D75399	2020-02-29 18:28:54 +00:00
Alexandre Ganea	8404aeb56a	[Support] On Windows, ensure hardware_concurrency() extends to all CPU sockets and all NUMA groups The goal of this patch is to maximize CPU utilization on multi-socket or high core count systems, so that parallel computations such as LLD/ThinLTO can use all hardware threads in the system. Before this patch, on Windows, a maximum of 64 hardware threads could be used at most, in some cases dispatched only on one CPU socket. == Background == Windows doesn't have a flat cpu_set_t like Linux. Instead, it projects hardware CPUs (or NUMA nodes) to applications through a concept of "processor groups". A "processor" is the smallest unit of execution on a CPU, that is, an hyper-thread if SMT is active; a core otherwise. There's a limit of 32-bit processors on older 32-bit versions of Windows, which later was raised to 64-processors with 64-bit versions of Windows. This limit comes from the affinity mask, which historically is represented by the sizeof(void). Consequently, the concept of "processor groups" was introduced for dealing with systems with more than 64 hyper-threads. By default, the Windows OS assigns only one "processor group" to each starting application, in a round-robin manner. If the application wants to use more processors, it needs to programmatically enable it, by assigning threads to other "processor groups". This also means that affinity cannot cross "processor group" boundaries; one can only specify a "preferred" group on start-up, but the application is free to allocate more groups if it wants to. This creates a peculiar situation, where newer CPUs like the AMD EPYC 7702P (64-cores, 128-hyperthreads) are projected by the OS as two (2) "processor groups". This means that by default, an application can only use half of the cores. This situation could only get worse in the years to come, as dies with more cores will appear on the market. == The problem == The heavyweight_hardware_concurrency() API was introduced so that only one hardware thread per core* was used. Once that API returns, that original intention is lost, only the number of threads is retained. Consider a situation, on Windows, where the system has 2 CPU sockets, 18 cores each, each core having 2 hyper-threads, for a total of 72 hyper-threads. Both heavyweight_hardware_concurrency() and hardware_concurrency() currently return 36, because on Windows they are simply wrappers over std:🧵:hardware_concurrency() -- which can only return processors from the current "processor group". == The changes in this patch == To solve this situation, we capture (and retain) the initial intention until the point of usage, through a new ThreadPoolStrategy class. The number of threads to use is deferred as late as possible, until the moment where the std::threads are created (ThreadPool in the case of ThinLTO). When using hardware_concurrency(), setting ThreadCount to 0 now means to use all the possible hardware CPU (SMT) threads. Providing a ThreadCount above to the maximum number of threads will have no effect, the maximum will be used instead. The heavyweight_hardware_concurrency() is similar to hardware_concurrency(), except that only one thread per hardware core will be used. When LLVM_ENABLE_THREADS is OFF, the threading APIs will always return 1, to ensure any caller loops will be exercised at least once. Differential Revision: https://reviews.llvm.org/D71775	2020-02-14 10:24:22 -05:00
Benjamin Kramer	adcd026838	Make llvm::StringRef to std::string conversions explicit. This is how it should've been and brings it more in line with std::string_view. There should be no functional change here. This is mostly mechanical from a custom clang-tidy check, with a lot of manual fixups. It uncovers a lot of minor inefficiencies. This doesn't actually modify StringRef yet, I'll do that in a follow-up.	2020-01-28 23:25:25 +01:00

1 2 3

115 Commits