llvm-project

Commit Graph

Author	SHA1	Message	Date
Andrea Di Biagio	373a4ccf6c	[llvm-mca][MC] Add the ability to declare which processor resources model load/store queues (PR36666). This patch adds the ability to specify via tablegen which processor resources are load/store queue resources. A new tablegen class named MemoryQueue can be optionally used to mark resources that model load/store queues. Information about the load/store queue is collected at 'CodeGenSchedule' stage, and analyzed by the 'SubtargetEmitter' to initialize two new fields in struct MCExtraProcessorInfo named `LoadQueueID` and `StoreQueueID`. Those two fields are identifiers for buffered resources used to describe the load queue and the store queue. Field `BufferSize` is interpreted as the number of entries in the queue, while the number of units is a throughput indicator (i.e. number of available pickers for loads/stores). At construction time, LSUnit in llvm-mca checks for the presence of extra processor information (i.e. MCExtraProcessorInfo) in the scheduling model. If that information is available, and fields LoadQueueID and StoreQueueID are set to a value different than zero (i.e. the invalid processor resource index), then LSUnit initializes its LoadQueue/StoreQueue based on the BufferSize value declared by the two processor resources. With this patch, we more accurately track dynamic dispatch stalls caused by the lack of LS tokens (i.e. load/store queue full). This is also shown by the differences in two BdVer2 tests. Stalls that were previously classified as generic SCHEDULER FULL stalls are now correctly classified either as "load queue full" or "store queue full". About the differences in the -scheduler-stats view: those differences are expected, because entries in the load/store queue are not released at instruction issue stage. Instead, those are released at instruction executed stage. This is the main reason why for the modified tests, the load/store queues get full before PdEx is full.
Differential Revision: https://reviews.llvm.org/D54957 llvm-svn: 347857	2018-11-29 12:15:56 +00:00
Andrea Di Biagio	d10ed7c8d7	Reapply "[llvm-mca] Return the total number of cycles from method Pipeline::run()." This reapplies r347767 (originally reviewed at: https://reviews.llvm.org/D55000) with a fix for the missing std::move of the Error returned by the call to Pipeline::runCycle(). Below is the original commit message from r347767. If a user only cares about the overall latency, then the best/quickest way is to change method Pipeline::run() so that it returns the total number of cycles to the caller. When the simulation pipeline is run, the number of cycles (or an error) is returned from method Pipeline::run(). The advantage is that no hardware event listener is needed for computing that latency. So, the whole process should be faster (and simpler - at least for that particular use case). llvm-svn: 347795	2018-11-28 19:31:19 +00:00
Andrea Di Biagio	7368fe4207	Revert [llvm-mca] Return the total number of cycles from method Pipeline::run(). This reverts commits 347767. llvm-svn: 347775	2018-11-28 16:39:48 +00:00
Andrea Di Biagio	2a68a27010	[llvm-mca] Return the total number of cycles from method Pipeline::run(). If a user only cares about the overall latency, then the best/quickest way is to change method Pipeline::run() so that it returns the total number of cycles to the caller. When the simulation pipeline is run, the number of cycles (or an error) is returned from method Pipeline::run(). The advantage is that no hardware event listener is needed for computing that latency. So, the whole process should be faster (and simpler - at least for that particular use case). llvm-svn: 347767	2018-11-28 16:24:51 +00:00
Andrea Di Biagio	36296c0484	[llvm-mca] Add support for instructions with a variadic number of operands. By default, llvm-mca conservatively assumes that a register operand from the variadic sequence is both a register read and a register write. That is because MCInstrDesc doesn't describe extra variadic operands; we don't have enough dataflow information to tell which register operand from the variadic sequence is a definition, and which is a use instead. However, if a variadic instruction is flagged 'mayStore' (but not 'mayLoad'), and it has no 'unmodeledSideEffects', then llvm-mca (very) optimistically assumes that any register operand in the variadic sequence is a register read only. Conversely, if a variadic instruction is marked as 'mayLoad' (but not 'mayStore'), and it has no 'unmodeledSideEffects', then llvm-mca optimistically assumes that any extra register operand is a register definition only. These assumptions work quite well for variadic load/store multiple instructions defined by the ARM backend. llvm-svn: 347522	2018-11-25 12:46:24 +00:00
Andrea Di Biagio	42720603c4	[llvm-mca] InstrBuilder: warnings for call/ret instructions are only reported once. llvm-svn: 347514	2018-11-24 18:40:45 +00:00
Andrea Di Biagio	7e32cc8353	[llvm-mca] Refactor some of the logic in InstrBuilder, and add a verifyOperands method. With this change, InstrBuilder emits an error if the MCInst sequence contains an instruction with a variadic opcode, and a non-zero number of variadic operands. Currently we don't know how to correctly analyze variadic opcodes. The problem with variadic operands is that there is no information for them in the opcode descriptor (i.e. MCInstrDesc). That means, we don't know which variadic operands are defs, and which are uses. In future, we could try to conservatively assume that any extra register operand is both a register use and a register definition. This patch fixes a subtle bug in the evaluation of read/write operands for ARM VLD1 with implicit index update. Added test vld1-index-update.s llvm-svn: 347503	2018-11-23 20:26:57 +00:00
Andrea Di Biagio	840f032630	[llvm-mca] LSUnit: use a SmallSet to model load/store queues. NFCI Also, try to minimize the number of queries to the memory queues to speed up the analysis. On average, this change gives a small 2% speedup. For memcpy-like kernels, the speedup is up to 5.5%. llvm-svn: 347469	2018-11-22 15:47:44 +00:00
Andrea Di Biagio	13e1d20755	[llvm-mca] Use a SmallVector instead of std::vector to track register reads/writes. NFCI This avoids a heap allocation most of the time. This patch gives a small but consistent 3% speedup on a release build (up to ~5% on a debug build). llvm-svn: 347464	2018-11-22 14:48:53 +00:00
Andrea Di Biagio	1cb8a3c690	[llvm-mca] Fix an invalid memory read introduced by r346487. This patch fixes an invalid memory read introduced by r346487. Before this patch, partial register write had to query the latency of the dependent full register write by calling a method on the full write descriptor. However, if the full write is from an already retired instruction, chances are that the EntryStage already reclaimed its memory. In some partial register write tests, valgrind was reporting an invalid memory read. This change fixes the invalid memory access problem. Writes are now responsible for tracking dependent partial register writes, and notify them in the event of instruction issued. That means, partial register writes no longer need to query their associated full write to check when they are ready to execute. Added test X86/BtVer2/partial-reg-update-7.s llvm-svn: 347459	2018-11-22 12:48:57 +00:00
Andrea Di Biagio	dda9032314	[llvm-mca] Correctly update the resource strategy for processor resources with multiple units. When looking at the tests committed by Roman at r346587, I noticed that numbers reported by the resource pressure for PdAGU01 were wrong. In particular, according to the auto-generated CHECK lines in tests memcpy-like-test.s and store-throughput.s, resource pressure for PdAGU01 was not uniformly distributed among the two AGEN pipes. It turns out that the reason why pressure was not correctly distributed, was because the "resource selection strategy" object associated with PdAGU01 was not correctly updated on the event of AGEN pipe used. As a result, llvm-mca was not simulating a round-robin pipeline allocation for PdAGU01. Instead, PdAGU1 was always prioritized over PdAGU0. This patch fixes the issue; now processor resource strategy objects for resources declaring multiple units, are correctly notified in the event of "resource used". llvm-svn: 346650	2018-11-12 13:09:39 +00:00
Andrea Di Biagio	91bdf24cfd	[llvm-mca] Account for buffered resources when analyzing "Super" resources. This was noticed when working on PR3946. By construction, a group cannot be used as a "Super" resource. That constraint is enforced by method `SubtargetEmitter::ExpandProcResource()`. A Super resource S can be part of a group G. However, method `SubtargetEmitter::ExpandProcResource()` would not update the number of consumed resource cycles in G based on S. In practice, this is perfectly fine because the resource usage is correctly computed for processor resource units. However, llvm-mca should still check if G is a buffered resource. Before this patch, llvm-mca didn't correctly check if S was part of a group that defines a buffer. So, the instruction descriptor was not correctly set. For now, the semantic change introduced by this patch doesn't affect any of the upstream scheduling models. However, it will allow us to make some progress on PR3946. llvm-svn: 346545	2018-11-09 19:30:20 +00:00
Andrea Di Biagio	dffec12f33	[llvm-mca] Use a small vector for instructions in the EntryStage. Use a simple SmallVector to track the lifetime of simulated instructions. An ordered map was not needed because instructions are already picked in program order. It is also much faster if we avoid searching for already retired instructions at the end of every cycle. The new policy only triggers a "garbage collection" when the number of retired instructions becomes significantly big when compared with the total size of the vector. While working on this, I noticed that instructions were correctly retired, but their internal state was not updated (i.e. there was no transition from the EXECUTED state, to the RETIRED state). While this was not a problem for the views, it prevented the EntryStage from correctly garbage collecting already retired instructions. That was a bad oversight, and this patch fixes it. The observed speedup on a debug build of llvm-mca after this patch is ~6%. On a release build of llvm-mca, the observed speedup is ~15%. llvm-svn: 346487	2018-11-09 12:29:57 +00:00
Andrea Di Biagio	d66f4e472a	[llvm-mca] PR39261: Rename FetchStage to EntryStage. This fixes PR39261. FetchStage is a misnomer. It causes confusion with the frontend fetch stage, which we don't currently simulate. I decided to rename it into EntryStage mainly because this is meant to be a "source" stage for all pipelines. Differential Revision: https://reviews.llvm.org/D54268 llvm-svn: 346419	2018-11-08 17:49:30 +00:00
Andrea Di Biagio	fe3bc1b9bf	[llvm-mca] Add extra counters for move elimination in view RegisterFileStatistics. This patch teaches view RegisterFileStatistics how to report events for optimizable register moves. For each processor register file, view RegisterFileStatistics reports the following extra information: - Number of optimizable register moves - Number of register moves eliminated - Number of zero moves (i.e. register moves that propagate a zero) - Max Number of moves eliminated per cycle. Differential Revision: https://reviews.llvm.org/D53976 llvm-svn: 345865	2018-11-01 18:04:39 +00:00
Fangrui Song	5a8fd65700	[llvm-mca] Move namespace mca inside llvm:: Summary: This allows removing `using namespace llvm;` in those .cpp files. When we want to revisit the decision (everything resides in llvm::mca::) in the future, we can move things to a nested namespace of llvm::mca::, to conceptually make them separate from the rest of llvm::mca::* Reviewers: andreadb, mattd Reviewed By: andreadb Subscribers: javed.absar, tschuett, gbedwell, llvm-commits Differential Revision: https://reviews.llvm.org/D53407 llvm-svn: 345612	2018-10-30 15:56:08 +00:00
Andrea Di Biagio	df4d65dda1	[llvm-mca] Lower to mca::Instruction before the pipeline is run. Before this change, the lowering of instructions from llvm::MCInst to mca::Instruction was done as part of the first stage of the pipeline (i.e. the FetchStage). In particular, FetchStage was responsible for picking the next instruction from the source sequence, and lowering it to an mca::Instruction with the help of an object of class InstrBuilder. The dependency on InstrBuilder was problematic for a number of reasons. Class InstrBuilder only knows how to lower from llvm::MCInst to mca::Instruction. That means, it is hard to support a different scenario where instructions in input are not instances of class llvm::MCInst. Even if we managed to specialize InstrBuilder, and generalize most of its internal logic, the dependency on InstrBuilder in FetchStage would have caused more trouble (other than complicating the pipeline logic). With this patch, the lowering step is done before the pipeline is run. The pipeline is no longer responsible for lowering from MCInst to mca::Instruction. As a consequence of this, the FetchStage no longer needs to interact with an InstrBuilder. The mca::SourceMgr class now simply wraps a reference to a sequence of mca::Instruction objects. This simplifies the logic of FetchStage, and increases the usability of it. As a result, on a debug build, we see a 7-9% speedup; on a release build, the speedup is around 3-4%. llvm-svn: 345500	2018-10-29 13:29:22 +00:00
Andrea Di Biagio	1e6d0aad7e	[llvm-mca] Introduce a new base class for mca::Instruction, and change how read/write information is stored. This patch introduces a new base class for Instruction named InstructionBase. Class InstructionBase is responsible for tracking data dependencies with the help of ReadState and WriteState objects. Class Instruction now derives from InstructionBase, and adds extra information related to the `InstrStage` as well as the `RCUTokenID`. ReadState and WriteState objects are no longer unique pointers. This avoids extra heap allocation and pointer checks that weren't really needed. Now, those objects are simply stored into SmallVectors. We use a SmallVector instead of a std::vector because we expect most instructions to only have a very small number of reads and writes. By using a simple SmallVector we also avoid extra heap allocations most of the time. In a debug build, this improves the performance of llvm-mca by roughly 10% (I still have to verify the impact in performance on a release build). llvm-svn: 345280	2018-10-25 17:03:51 +00:00
Andrea Di Biagio	77c26aebda	[llvm-mca] Removed a couple of redundant method declarations, and simplified code in ResourcePressureView. NFC llvm-svn: 345259	2018-10-25 11:51:34 +00:00
Matt Davis	b5d5debdbc	[llvm-mca] Replace InstRef::isValid with operator bool. NFC. llvm-svn: 345190	2018-10-24 20:27:47 +00:00
Andrea Di Biagio	cd4deea1c4	[llvm-mca] Simplify the logic in FetchStage. NFCI Only method 'getNextInstruction()' needs to interact with the SourceMgr. llvm-svn: 345185	2018-10-24 19:37:45 +00:00
Andrea Di Biagio	65c77d7283	[llvm-mca] Remove dependency from InstrBuilder in class InstructionTables. Also, removed the initialization of vectors used for processor resource masks. Support function 'computeProcResourceMasks()' already calls method resize on those vectors. No functional change intended. llvm-svn: 345161	2018-10-24 16:56:43 +00:00
Andrea Di Biagio	083addf751	[llvm-mca] Improved error handling and error reporting from class InstrBuilder. A new class named InstructionError has been added to Support.h in order to improve the error reporting from class InstrBuilder. The llvm-mca driver is responsible for handling InstructionError objects, and printing them out to stderr. The goal of this patch is to remove all the remaining error handling logic from the library code. In particular, this allows us to: - Simplify the logic in InstrBuilder by removing a needless dependency from MCInstrPrinter. - Centralize all the error handling logic in a new function named 'runPipeline' (see llvm-mca.cpp). This is also a first step towards generalizing class InstrBuilder, so that in future, we will be able to reuse its logic to also "lower" MachineInstr to mca::Instruction objects. Differential Revision: https://reviews.llvm.org/D53585 llvm-svn: 345129	2018-10-24 10:56:47 +00:00
Andrea Di Biagio	01b9fd6868	[llvm-mca] Use llvm::ArrayRef in class SourceMgr. NFCI Class SourceMgr now uses type ArrayRef<MCInst> to reference the sequence of code from a "CodeRegion". llvm-svn: 344911	2018-10-22 15:36:15 +00:00
Fangrui Song	2e83b2e9ee	Use llvm::{all,any,none}_of instead std::{all,any,none}_of. NFC llvm-svn: 344774	2018-10-19 06:12:02 +00:00
Andrea Di Biagio	6c17e80265	[llvm-mca] Correctly set aliases for register writes introduced by optimized register moves. This fixes a problem introduced by r344334. A write from a non-zero move eliminated at register renaming stage was not correctly handled by the PRF. This would have led to an assertion failure if the processor model declares a PRF that enables non-zero move elimination. llvm-svn: 344392	2018-10-12 18:18:53 +00:00
Andrea Di Biagio	6eebbe0a97	[tblgen][llvm-mca] Add the ability to describe move elimination candidates via tablegen. This patch adds the ability to identify instructions that are "move elimination candidates". It also allows scheduling models to describe processor register files that allow move elimination. A move elimination candidate is an instruction that can be eliminated at register renaming stage. Each subtarget can specify which instructions are move elimination candidates with the help of tablegen class "IsOptimizableRegisterMove" (see llvm/Target/TargetInstrPredicate.td). For example, on X86, BtVer2 allows both GPR and MMX/SSE moves to be eliminated. The definition of 'IsOptimizableRegisterMove' for BtVer2 looks like this: ``` def : IsOptimizableRegisterMove<[ InstructionEquivalenceClass<[ // GPR variants. MOV32rr, MOV64rr, // MMX variants. MMX_MOVQ64rr, // SSE variants. MOVAPSrr, MOVUPSrr, MOVAPDrr, MOVUPDrr, MOVDQArr, MOVDQUrr, // AVX variants. VMOVAPSrr, VMOVUPSrr, VMOVAPDrr, VMOVUPDrr, VMOVDQArr, VMOVDQUrr ], CheckNot<CheckSameRegOperand<0, 1>> > ]>; ``` Definitions of IsOptimizableRegisterMove from processor models of a same Target are processed by the SubtargetEmitter to auto-generate a target-specific override for each of the following predicate methods: ``` bool TargetSubtargetInfo::isOptimizableRegisterMove(const MachineInstr *MI) const; bool MCInstrAnalysis::isOptimizableRegisterMove(const MCInst &MI, unsigned CPUID) const; ``` By default, those methods return false (i.e. conservatively assume that there are no move elimination candidates). Tablegen class RegisterFile has been extended with the following information: - The set of register classes that allow move elimination. - Maximum number of moves that can be eliminated every cycle. - Whether move elimination is restricted to moves from registers that are known to be zero.
This patch is structured in three parts: A first part (which is mostly boilerplate) adds the new 'isOptimizableRegisterMove' target hooks, and extends existing register file descriptors in MC by introducing new fields to describe properties related to move elimination. A second part uses the new tablegen constructs to describe move elimination in the BtVer2 scheduling model. A third part teaches llvm-mca how to query the new 'isOptimizableRegisterMove' hook to mark instructions that are candidates for move elimination. It also teaches class RegisterFile how to describe constraints on move elimination at PRF granularity. llvm-mca tests for btver2 show differences before/after this patch. Differential Revision: https://reviews.llvm.org/D53134 llvm-svn: 344334	2018-10-12 11:23:04 +00:00
Andrea Di Biagio	f455e3569f	[tblgen][CodeGenSchedule] Add a check for invalid RegisterFile definitions with zero physical registers. llvm-svn: 344235	2018-10-11 10:39:03 +00:00
Andrea Di Biagio	9efbfa88c3	[llvm-mca] Minor refactoring in preparation for a patch that will fully fix PR36671. NFCI llvm-svn: 344149	2018-10-10 16:08:02 +00:00
Andrea Di Biagio	2ee9f37fce	[llvm-mca] Move field 'AllowZeroMoveEliminationOnly' to class RegisterFile. NFC. Flag 'AllowZeroMoveEliminationOnly' should have been a property of the PRF, and not set at register granularity. This change also restricts move elimination to writes that update a full physical register. We assume that there is a strong correlation between logical registers that allow move elimination, and how those same registers are allocated to physical registers by the register renamer. This is still a no functional change, because this experimental code path is disabled for now. This is done in preparation for another patch that will add the ability to describe how move elimination works in scheduling models. llvm-svn: 343787	2018-10-04 15:20:56 +00:00
Andrea Di Biagio	aacd5e187b	[llvm-mca] Check for inconsistencies when constructing instruction descriptors. This should help with catching inconsistent definitions of instructions with zero opcodes, which also declare to consume scheduler/pipeline resources. llvm-svn: 343766	2018-10-04 10:36:49 +00:00
Andrea Di Biagio	207e0217f9	[llvm-mca] Add support for move elimination in class RegisterFile. This patch teaches class RegisterFile how to analyze register writes from instructions that are move elimination candidates. In particular, it teaches it how to check if a move can be effectively eliminated by the underlying PRF, and (if necessary) how to perform move elimination. The long term goal is to allow processor models to describe instructions that are valid move elimination candidates. The idea is to let register file definitions in tablegen declare if/when moves can be eliminated. This patch is a non functional change. The logic that performs move elimination is currently disabled. A future patch will add support for move elimination in the processor models, and enable this new code path. llvm-svn: 343691	2018-10-03 15:02:44 +00:00
Matt Davis	21d41dffe1	[llvm-mca] Constify the 'notify' routines. NFC. Also fixed up some whitespace formatting in DispatchStage.cpp. llvm-svn: 343615	2018-10-02 18:26:33 +00:00
Owen Rodley	31fddbac8f	[MCA] Remove SM.hasNext() call in FetchStage::execute. Summary: This is redundant, as FetchStage::getNextInstruction already checks this and returns llvm::ErrorSuccess() as appropriate. NFC. Reviewers: andreadb Subscribers: gbedwell, llvm-commits Differential Revision: https://reviews.llvm.org/D52642 llvm-svn: 343555	2018-10-02 00:40:08 +00:00
Matt Davis	8e2c75900e	[llvm-mca] Rename the 'Subtract' method to 'subtract' llvm-svn: 343549	2018-10-01 23:01:45 +00:00
Andrea Di Biagio	a7699127a9	[llvm-mca] Remove redundant namespace prefixes. NFC We are already "using" namespace llvm in all the files modified by this change. llvm-svn: 343312	2018-09-28 10:47:24 +00:00
Andrea Di Biagio	417ef40c39	[llvm-mca] Teach how to track zero registers in class RegisterFile. This change is in preparation for a future work on improving support for optimizable register moves. We already know if a write is from a zero-idiom, so we can propagate that bit of information to the PRF. We use an APInt mask to identify registers that are set to zero. llvm-svn: 343307	2018-09-28 09:42:06 +00:00
Fangrui Song	0cac726a00	llvm::sort(C.begin(), C.end(), ...) -> llvm::sort(C, ...) Summary: The convenience wrapper in STLExtras is available since rL342102. Reviewers: dblaikie, javed.absar, JDevlieghere, andreadb Subscribers: MatzeB, sanjoy, arsenm, dschuff, mehdi_amini, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, eraman, aheejin, kbarton, JDevlieghere, javed.absar, gbedwell, jrtc27, mgrang, atanasyan, steven_wu, george.burgess.iv, dexonsmith, kristina, jsji, llvm-commits Differential Revision: https://reviews.llvm.org/D52573 llvm-svn: 343163	2018-09-27 02:13:45 +00:00
Andrea Di Biagio	86502ddeaa	[llvm-mca] Improve code comments in LSUnit.{h, cpp}. NFC llvm-svn: 342877	2018-09-24 12:45:26 +00:00
Dean Michael Berris	92a05bfbf0	[MCA] Remove dependency on CodeGen. Summary: There isn't any actual dependency - there's one #include from CodeGen but nothing from the header is actually used. With this change we can use the MCA library from CodeGen without circular dependencies (e.g. for scheduling). Reviewers: andreadb Reviewed By: andreadb Authored By: orodley Subscribers: mgorny, gbedwell, llvm-commits Differential Revision: https://reviews.llvm.org/D52288 llvm-svn: 342706	2018-09-21 01:54:08 +00:00
Andrea Di Biagio	8b6c314be1	[TableGen][SubtargetEmitter] Add the ability for processor models to describe dependency breaking instructions. This patch adds the ability for processor models to describe dependency breaking instructions. Different processors may specify a different set of dependency-breaking instructions. That means, we cannot assume that all processors of the same target would use the same rules to classify dependency breaking instructions. The main goal of this patch is to provide the means to describe dependency breaking instructions directly via tablegen, and have the following TargetSubtargetInfo hooks redefined in overrides by tablegen'd XXXGenSubtargetInfo classes (here, XXX is a Target name). ``` virtual bool isZeroIdiom(const MachineInstr *MI, APInt &Mask) const { return false; } virtual bool isDependencyBreaking(const MachineInstr *MI, APInt &Mask) const { return isZeroIdiom(MI); } ``` An instruction MI is a dependency-breaking instruction if a call to method isDependencyBreaking(MI) on the STI (TargetSubtargetInfo object) evaluates to true. Similarly, an instruction MI is a special case of zero-idiom dependency breaking instruction if a call to STI.isZeroIdiom(MI) returns true. The extra APInt is used for those targets that may want to select which machine operands have their dependency broken (see comments in code). Note that by default, subtargets don't know about the existence of dependency-breaking. In the absence of external information, those method calls would always return false. A new tablegen class named STIPredicate has been added by this patch to let processor models classify instructions that have properties in common. The idea is that, a MCInstrPredicate definition can be used to "generate" an instruction equivalence class, with the idea that instructions of a same class all have a property in common. STIPredicate definitions are essentially a collection of instruction equivalence classes.
Also, different processor models can specify a different variant of the same STIPredicate with different rules (i.e. predicates) to classify instructions. Tablegen backends (in this particular case, the SubtargetEmitter) will be able to process STIPredicate definitions, and automatically generate functions in XXXGenSubtargetInfo. This patch introduces two special kind of STIPredicate classes named IsZeroIdiomFunction and IsDepBreakingFunction in tablegen. It also adds a definition for those in the BtVer2 scheduling model only. This patch supersedes the one committed at r338372 (phabricator review: D49310). The main advantages are: - We can describe subtarget predicates via tablegen using STIPredicates. - We can describe zero-idioms / dep-breaking instructions directly via tablegen in the scheduling models. In future, the STIPredicates framework can be used for solving other problems. Examples of future developments are: - Teach how to identify optimizable register-register moves - Teach how to identify slow LEA instructions (each subtarget defining its own concept of "slow" LEA). - Teach how to identify instructions that have undocumented false dependencies on the output registers on some processors only. It is also (in my opinion) an elegant way to expose knowledge to both external tools like llvm-mca, and codegen passes. For example, machine schedulers in LLVM could reuse that information when internally constructing the data dependency graph for a code region. This new design feature is also an "opt-in" feature. Processor models don't have to use the new STIPredicates. It has all been designed to be as unintrusive as possible. Differential Revision: https://reviews.llvm.org/D52174 llvm-svn: 342555	2018-09-19 15:57:45 +00:00
Andrea Di Biagio	9f9cdd41cc	[llvm-mca] Add the ability to mark register reads/writes associated with dep-breaking instructions. NFCI This patch adds two new boolean fields: - Field `ReadState::IndependentFromDef`. - Field `WriteState::WritesZero`. Field `IndependentFromDef` is set for ReadState objects associated with dependency-breaking instructions. It is used by the simulator when updating data dependencies between registers. Field `WritesZero` is set by WriteState objects associated with dependency breaking zero-idiom instructions. It helps the PRF identify which writes don't consume any physical registers. llvm-svn: 342483	2018-09-18 15:00:06 +00:00
Andrea Di Biagio	afbc234b41	[llvm-mca] Slightly refactor class InstRef. NFC. llvm-svn: 342480	2018-09-18 14:03:46 +00:00
Nico Weber	b09a8c9bd9	Revert r342148 (and follow-on fix attempts r342154, r342180, r342182, r342193) Many bots building with make have been broken for several days, e.g. http://lab.llvm.org:8011/builders/lld-x86_64-darwin13 llvm-svn: 342336	2018-09-15 19:04:27 +00:00
Richard Diamond	f3063baa6e	Renovate CMake files in the `llvm-(cfi-verify\|exegesis\|mca)` tools. llvm-svn: 342148	2018-09-13 16:15:03 +00:00
Matt Davis	db834837c2	[llvm-mca] Delay calculation of Cycles per Resources, separate the cycles and resource quantities. Summary: This patch removes the storing of accumulated floating point data within the llvm-mca library. This patch splits-up the two quantities: cycles and number of resource units. By splitting-up these two quantities, we delay the calculation of "cycles per resource unit" until that value is read, reducing the chance of accumulating floating point error. I considered using the APFloat, but after measuring performance, for a large (many iteration) sample, I decided to go with this faster solution. Reviewers: andreadb, courbet, RKSimon Reviewed By: andreadb Subscribers: llvm-commits, javed.absar, tschuett, gbedwell Differential Revision: https://reviews.llvm.org/D51903 llvm-svn: 341980	2018-09-11 18:47:48 +00:00
Matt Davis	e0d03e9665	[llvm-mca] Fix typo in debug output. NFC. llvm-svn: 341281	2018-09-01 18:32:33 +00:00
Andrea Di Biagio	8b647dcf4b	[llvm-mca] Report the number of dispatched micro opcodes in the DispatchStatistics view. This patch introduces the following changes to the DispatchStatistics view: * DispatchStatistics now reports the number of dispatched opcodes instead of the number of dispatched instructions. * The "Dynamic Dispatch Stall Cycles" table now also reports the percentage of stall cycles against the total simulated cycles. This change allows users to easily compare dispatch group sizes with the processor DispatchWidth. Before this change, it was difficult to correlate the two numbers, since DispatchStatistics view reported numbers of instructions (instead of opcodes). DispatchWidth defines the maximum size of a dispatch group in terms of number of micro opcodes. The other change introduced by this patch is related to how DispatchStage generates "instruction dispatch" events. In particular: * There can be multiple dispatch events associated with a same instruction * Each dispatch event now encapsulates the number of dispatched micro opcodes. The number of micro opcodes declared by an instruction may exceed the processor DispatchWidth. Therefore, we cannot assume that instructions are always fully dispatched in a single cycle. DispatchStage knows already how to handle instructions declaring a number of opcodes bigger than DispatchWidth. However, DispatchStage always emitted a single instruction dispatch event (during the first simulated dispatch cycle) for instructions dispatched. With this patch, DispatchStage now correctly notifies multiple dispatch events for instructions that cannot be dispatched in a single cycle. A few views had to be modified. Views can no longer assume that there can only be one dispatch event per instruction. Tests (and docs) have been updated. Differential Revision: https://reviews.llvm.org/D51430 llvm-svn: 341055	2018-08-30 10:50:20 +00:00
Matt Davis	d9198907a6	[llvm-mca] Remove unused formal. NFC. llvm-svn: 340888	2018-08-29 00:41:04 +00:00
Matt Davis	15ecfbf1f6	[llvm-mca] Move the initialization of Pipeline. NFC. Code cleanup to make the pipeline creation routine easier to read. llvm-svn: 340887	2018-08-29 00:34:32 +00:00
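
Commit db834837c2 in the list above describes keeping cycles and resource units as separate quantities and deferring the "cycles per resource unit" division until the value is read. A minimal sketch of that idea follows. It assumes a hypothetical `CyclesPerUnit` type; the type name, its fields and the `gcd64` helper are illustrative only and are not taken from the llvm-mca sources.

```cpp
#include <cassert>
#include <cstdint>

using u64 = std::uint64_t;

// Euclid's algorithm; hypothetical helper used to keep the fraction reduced.
static u64 gcd64(u64 A, u64 B) {
  while (B != 0) {
    u64 T = A % B;
    A = B;
    B = T;
  }
  return A;
}

// Hypothetical type (not the llvm-mca class): stores "cycles" and
// "resource units" as two integers instead of one accumulated double.
struct CyclesPerUnit {
  u64 Cycles = 0; // numerator
  u64 Units = 1;  // denominator, never zero

  CyclesPerUnit() = default;
  CyclesPerUnit(u64 C, u64 U) : Cycles(C), Units(U) {
    assert(U != 0 && "a resource must have at least one unit");
  }

  // Exact integer accumulation over a common denominator.
  // Overflow is ignored in this sketch.
  CyclesPerUnit &operator+=(const CyclesPerUnit &RHS) {
    u64 Common = Units / gcd64(Units, RHS.Units) * RHS.Units; // lcm
    Cycles = Cycles * (Common / Units) + RHS.Cycles * (Common / RHS.Units);
    Units = Common;
    u64 G = gcd64(Cycles, Units);
    if (G > 1) {
      Cycles /= G;
      Units /= G;
    }
    return *this;
  }

  // The only floating point operation: one division, done on read.
  double value() const { return static_cast<double>(Cycles) / Units; }
};
```

With this layout the running total stays an exact ratio, and rounding can only occur in the single division performed by `value()`.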

52 Commits