llvm-project

Author	SHA1	Message	Date
Andrea Di Biagio	39e5a5695f	[RFC][patch 3/3] Add support for variant scheduling classes in llvm-mca. This patch is the last of a sequence of three patches related to LLVM-dev RFC "MC support for variant scheduling classes". http://lists.llvm.org/pipermail/llvm-dev/2018-May/123181.html This fixes PR36672. The main goal of this patch is to teach llvm-mca how to solve variant scheduling classes. This patch does that, plus it adds new variant scheduling classes to the BtVer2 scheduling model to identify so-called zero-idioms (i.e. so-called dependency breaking instructions that are known to generate zero, and that are optimized out in hardware at register renaming stage). Without the BtVer2 change, this patch would not have had any meaningful tests. This patch is effectively the union of two changes: 1) a change that teaches llvm-mca how to resolve variant scheduling classes. 2) a change to the BtVer2 scheduling model that allows us to special-case packed XOR zero-idioms (this partially fixes PR36671). Differential Revision: https://reviews.llvm.org/D47374 llvm-svn: 333909	2018-06-04 15:43:09 +00:00
Andrea Di Biagio	2008c7c8fd	[llvm-mca] Track cycles contributed by resources that are in a 'Super' relationship. This is required if we want to correctly match the behavior of method SubtargetEmitter::ExpandProcResource() in Tablegen. When computing the set of "consumed" processor resources and resource cycles, the logic in ExpandProcResource() doesn't update the number of resource cycles contributed by a "Super" resource to a group. We need to take this into account when a model declares a processor resource which is part of a 'processor resource group', and it is also used as the "Super" of other resources. llvm-svn: 333892	2018-06-04 12:23:07 +00:00
Andrea Di Biagio	bdc670611b	[llvm-mca] Move the logic that computes the block throughput into Support.h. NFC This will allow us to share the logic that computes the block throughput with other views. llvm-svn: 333755	2018-06-01 14:35:21 +00:00
Andrea Di Biagio	4037011404	[llvm-mca] Fixed a problem caused by an invalid use of a processor resource mask in the Scheduler. The lambda functions used by method ResourceManager::mustIssueImmediately() was incorrectly truncating masks of buffered processor resources to 32-bit quantities. The invalid mask values were then used to access a map of processor resource descriptors. Fixes PR37643. llvm-svn: 333692	2018-05-31 20:27:46 +00:00
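The kind of truncation described in commit 4037011404 can be illustrated with a small sketch. This is not the code from the Scheduler: `DescriptorMap`, `hasDescriptorTruncated`, and `hasDescriptor` are hypothetical names, and the map's value type is a placeholder. It only shows why a 64-bit mask narrowed to 32 bits stops matching its key.

```cpp
#include <cstdint>
#include <map>

// Hypothetical descriptor table keyed by a 64-bit processor resource mask.
using DescriptorMap = std::map<uint64_t, int>;

// Buggy pattern: the parameter type narrows the mask to 32 bits, so any
// bit at position 32 or above is lost before the lookup happens.
bool hasDescriptorTruncated(const DescriptorMap &Map, uint32_t Mask) {
  return Map.count(Mask) != 0;
}

// Fixed pattern: the full 64-bit mask reaches the lookup unchanged.
bool hasDescriptor(const DescriptorMap &Map, uint64_t Mask) {
  return Map.count(Mask) != 0;
}
```

With a mask such as `1ULL << 40`, the truncated value is zero, so the lookup in the first function misses an entry that the second function finds.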
Matt Davis	aada043fa9	[llvm-mca] Update the header's guard name. NFC. This patch also places a comment at the end of the header guard. llvm-svn: 333297	2018-05-25 18:45:43 +00:00
Matt Davis	2d1d859c50	[llvm-mca] Update DispatchStage header comment. NFC. Updated the comment to be a wee bit more descriptive. llvm-svn: 333296	2018-05-25 18:31:28 +00:00
Matt Davis	5b79ffc5bc	[llvm-mca] Add the RetireStage. Summary: This class maintains the same logic as the original RetireControlUnit. This is just an intermediate patch to make the RCU a Stage. Future patches will remove the dependency on the DispatchStage, and then more properly populate the pre/execute/post Stage interface. Reviewers: andreadb, RKSimon, courbet Reviewed By: andreadb, courbet Subscribers: javed.absar, mgorny, tschuett, gbedwell, llvm-commits Differential Revision: https://reviews.llvm.org/D47244 llvm-svn: 333292	2018-05-25 18:00:25 +00:00
Andrea Di Biagio	0af811519a	[llvm-mca] Fix a rounding problem in SummaryView.cpp exposed by r333204. Before printing the block reciprocal throughput, ensure that the floating point number is always rounded the same way on every target. No functional change intended. llvm-svn: 333210	2018-05-24 17:22:14 +00:00
Matt Davis	6172c74696	[llvm-mca] Fix header comments. NFC. llvm-svn: 333096	2018-05-23 16:15:06 +00:00
Andrea Di Biagio	3fc20c9c7f	[llvm-mca] Print the "Block RThroughput" in the SummaryView. This patch implements the "block reciprocal throughput" computation in the SummaryView. The block reciprocal throughput is computed as the MAX of: - NumMicroOps / DispatchWidth - Resource Cycles / #Units (for every resource consumed). The block throughput is bounded from above by the hardware dispatch throughput. That is because the DispatchWidth is an upper bound on how many opcodes can be part of a single dispatch group. The block throughput is also limited by the amount of hardware parallelism. The number of available resource units affects how the resource pressure is distributed, and also how many blocks can be delivered every cycle. llvm-svn: 333095	2018-05-23 15:59:27 +00:00
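The block reciprocal throughput formula described in commit 3fc20c9c7f can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from SummaryView.cpp: the `ResourceUsage` struct and the `blockRThroughput` function are hypothetical names introduced here.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical record: total cycles the block consumes on one processor
// resource, and the number of units that resource provides.
struct ResourceUsage {
  double Cycles;
  unsigned NumUnits;
};

// Returns the MAX of NumMicroOps / DispatchWidth and
// Cycles / NumUnits taken over every consumed resource.
double blockRThroughput(unsigned NumMicroOps, unsigned DispatchWidth,
                        const std::vector<ResourceUsage> &Resources) {
  assert(DispatchWidth > 0 && "dispatch width must be positive");
  double Max = static_cast<double>(NumMicroOps) / DispatchWidth;
  for (const ResourceUsage &R : Resources) {
    assert(R.NumUnits > 0 && "a resource needs at least one unit");
    Max = std::max(Max, R.Cycles / R.NumUnits);
  }
  return Max;
}
```

For example, a block of 4 micro-ops with a dispatch width of 4 that consumes 3 cycles on a single-unit resource is limited by that resource: max(4/4, 3/1) = 3.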
Matt Davis	bd12532300	[llvm-mca] Move DispatchStage::cycleEvent to preExecute. NFC. Summary: This is an intermediate change, it moves the non-notification logic from Backend::notifyCycleBegin to runCycle(). Once the scheduler becomes part of the Execution stage the explicit call to Scheduler::cycleEvent will disappear. The logic for Dispatch::cycleEvent() can be in the preExecute phase, which this patch addresses. Reviewers: andreadb, RKSimon, courbet Reviewed By: andreadb Subscribers: tschuett, gbedwell, llvm-commits Differential Revision: https://reviews.llvm.org/D47213 llvm-svn: 333029	2018-05-22 20:51:58 +00:00
Andrea Di Biagio	cb1ed400a4	[llvm-mca] Removed an empty line generated by the timeline view. NFC. Also, regenerate all tests. llvm-svn: 332853	2018-05-21 17:11:56 +00:00
Matt Davis	679083e3d8	[llvm-mca] Make Dispatch a subclass of Stage. Summary: The logic of dispatch remains the same, but now DispatchUnit is a Stage (DispatchStage). This change has the benefit of simplifying the backend runCycle() code. The same logic applies, but it belongs to different components now. This is just a start, eventually we will need to remove the call to the DispatchStage in Scheduler.cpp, but that will be a separate patch. This change is mostly a renaming and moving of existing logic. This change also encouraged me to remove the Subtarget (STI) member from the Backend class. That member was used to initialize the other members of Backend and to eventually call DispatchUnit::dispatch(). Now that we have Stages, we can eliminate this by instantiating the DispatchStage with everything it needs at the time of construction (e.g., Subtarget). That change allows us to call DispatchStage::execute(IR) as we expect to call execute() for all other stages. Once we add the Stage list (D46907) we can more cleanly call preExecute() on all of the stages, DispatchStage, will probably wrap cycleEvent() in that case. Made some formatting and minor cleanups to README.txt. Some of the text was re-flowed to stay within 80 cols. Reviewers: andreadb, courbet, RKSimon Reviewed By: andreadb, courbet Subscribers: mgorny, javed.absar, tschuett, gbedwell, llvm-commits Differential Revision: https://reviews.llvm.org/D46983 llvm-svn: 332652	2018-05-17 19:22:29 +00:00
Andrea Di Biagio	55e9e0fefc	[llvm-mca] Hide unrelated flags from the -help output. llvm-svn: 332615	2018-05-17 15:35:14 +00:00
Andrea Di Biagio	650b5fc6cb	[llvm-mca] add flag -all-views and flag -all-stats. Flag -all-views enables all the views. Flag -all-stats enables all the views that print hardware statistics. llvm-svn: 332602	2018-05-17 12:27:03 +00:00
Matt Davis	b7972f88c7	[llvm-mca] Move the RegisterFile class into its own translation unit. NFC Summary: This change will help us turn the DispatchUnit into its own stage. Reviewers: andreadb, RKSimon, courbet Reviewed By: andreadb, courbet Subscribers: mgorny, tschuett, gbedwell, llvm-commits Differential Revision: https://reviews.llvm.org/D46916 llvm-svn: 332493	2018-05-16 17:07:08 +00:00
Andrea Di Biagio	076eba20bc	[llvm-mca] Move definitions in FetchStage.cpp inside namespace mca. NFC Also, get rid of a redundant include in FetchStage.h and FetchStage.cpp. llvm-svn: 332468	2018-05-16 13:38:17 +00:00
Andrea Di Biagio	88997ba27f	[llvm-mca] Fix perf regression after r332390. Revision 332390 introduced a FetchStage class in llvm-mca. By design, FetchStage owns all the instructions in-flight in the OoO Backend. Before this change, new instructions were added to a DenseMap indexed by instruction id. The problem with using a DenseMap is that elements are not ordered by key. This was causing a massive slow down in method FetchStage::postExecute(), which searches for instructions retired that can be deleted. This patch replaces the DenseMap with a std::map ordered by instruction index. At the end of every cycle, we search for the first instruction which is not marked as "retired", and we remove all the previous instructions before it. This works well because instructions are retired in-order. Before this patch, a debug build of llvm-mca (on my Ryzen linux machine) took ~8.0 seconds to simulate 3000 iterations of a x86 dot-product (a `vmulps, vpermilps, vaddps, vpermilps, vaddps` sequence). With this patch, it now takes ~0.8s to run all the 3000 iterations. llvm-svn: 332461	2018-05-16 12:33:09 +00:00
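The cleanup strategy described in commit 88997ba27f (an index-ordered `std::map`, scanned from the front until the first non-retired instruction) can be sketched like this. The `Inst` struct and the `dropRetired` function are hypothetical stand-ins, not the actual FetchStage code.

```cpp
#include <cstddef>
#include <map>

// Hypothetical stand-in for an in-flight instruction.
struct Inst {
  bool Retired = false;
};

// Erases the leading run of retired instructions and returns how many
// were removed. The map is ordered by instruction index and instructions
// retire in order, so the scan can stop at the first live entry.
std::size_t dropRetired(std::map<unsigned, Inst> &InFlight) {
  std::size_t Removed = 0;
  auto It = InFlight.begin();
  while (It != InFlight.end() && It->second.Retired) {
    ++It;
    ++Removed;
  }
  InFlight.erase(InFlight.begin(), It);
  return Removed;
}
```

Because the scan stops at the first live entry, the per-cycle cost is proportional to the number of instructions actually removed, not to the total number in flight.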
Andrea Di Biagio	ca0d30cd81	[llvm-mca] Remove redundant includes in Stage.h. This patch also makes Stage::isReady() a const method. No functional change. llvm-svn: 332443	2018-05-16 09:24:38 +00:00
Matt Davis	5d1cda1bc8	[llvm-mca] Introduce a pipeline Stage class and FetchStage. Summary: This is just an idea, really two ideas. I expect some push-back, but I realize that posting a diff is the most comprehensive way to express these concepts. This patch introduces a Stage class which represents the various stages of an instruction pipeline. As a start, I have created a simple FetchStage that is based on existing logic for how MCA produces instructions, but now encapsulated in a Stage. The idea should become more concrete once we introduce additional stages. The idea being, that when a stage completes, the next stage in the pipeline will be executed. Stages are chained together as a singly linked list to closely model a real pipeline. For now there is only one stage, so the stage-to-stage flow of instructions isn't immediately obvious. Eventually, Stage will also handle event notifications, but that functionality is not complete, and not destined for this patch. Ideally, an interested party can register for notifications from a particular stage. Callbacks will be issued to these listeners at various points in the execution of the stage. For now, eventing functionality remains similar to what it has been in mca::Backend. We will be building-up the Stage class as we move on, such as adding debug output. This patch also removes the unique_ptr<Instruction> return value from InstrBuilder::createInstruction. An Instruction pointer is still produced, but now it's up to the caller to decide how that item should be managed post-allocation (e.g., smart pointer). This allows the Fetch stage to create instructions and manage the lifetime of those instructions as it wishes, and not have to be bound to any specific managed pointer type. Other callers of createInstruction might have different requirements, and thus can manage the pointer to fit their needs. Another idea would be to push the ownership to the RCU. 
Currently, the FetchStage will wrap the Instruction pointer in a shared_ptr. This allows us to remove the Instruction container in Backend, which was probably going to disappear, or move, at some point anyways. Note that I did run these changes through valgrind, to make sure we are not leaking memory. While the shared_ptr comes with some additional overhead it relieves us from having to manage a list of generated instructions, and/or make lookup calls to remove the instructions. I realize that both the Stage class and the Instruction pointer management (mentioned directly above) are separate but related ideas, and probably should land as separate patches; I am happy to do that if either idea is decent. The main reason these two ideas are together is that Stage::execute() can mutate an InstRef. For the fetch stage, the InstRef is populated as the primary action of that stage (execute()). I didn't want to change the Stage interface to support the idea of generating an instruction. Ideally, instructions are to be pushed through the pipeline. I didn't want to draw too much of a specialization just for the fetch stage. Excuse the word-salad. Reviewers: andreadb, courbet, RKSimon Reviewed By: andreadb Subscribers: llvm-commits, mgorny, javed.absar, tschuett, gbedwell Differential Revision: https://reviews.llvm.org/D46741 llvm-svn: 332390	2018-05-15 20:21:04 +00:00
Andrea Di Biagio	039349a643	[llvm-mca] use a formatted_raw_ostream to insert padding and get rid of tabs. NFC llvm-svn: 332381	2018-05-15 18:11:45 +00:00
Andrea Di Biagio	a7c3c45267	[llvm-mca] Strip leading tabs and spaces from instruction strings before printing. NFC llvm-svn: 332361	2018-05-15 15:18:05 +00:00
Andrea Di Biagio	904684cf5c	[llvm-mca] Remove unused include header files. NFC Also, run clang-format on RetireControlUnit.cpp. llvm-svn: 332337	2018-05-15 10:30:39 +00:00
Andrea Di Biagio	e2492c860a	[llvm-mca] Add file header to RetireControlUnit.cpp. Strictly speaking, this is not necessary for .cpp files. However, other .cpp files from this same tool have it. This also matches what we do in other tools. llvm-svn: 332334	2018-05-15 09:31:32 +00:00
Andrea Di Biagio	8ea3a34e39	[llvm-mca] Improved support for dependency-breaking instructions. The tool assumes that a zero-latency instruction that doesn't consume hardware resources is an optimizable dependency-breaking instruction. That means, it doesn't have to wait on register input operands, and it doesn't consume any physical register. The PRF knows how to optimize it at register renaming stage. llvm-svn: 332249	2018-05-14 15:08:22 +00:00
Nicola Zaghen	d34e60ca85	Rename DEBUG macro to LLVM_DEBUG. The DEBUG() macro is very generic so it might clash with other projects. The renaming was done as follows: - git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g' - git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM - Manual change to APInt - Manually change DOCS as regex doesn't match it. In the transition period the DEBUG() macro is still present and aliased to the LLVM_DEBUG() one. Differential Revision: https://reviews.llvm.org/D43624 llvm-svn: 332240	2018-05-14 12:53:11 +00:00
David Blaikie	c30365ce1d	Move standard library inclusions to after internal inclusions. llvm-svn: 332124	2018-05-11 19:21:40 +00:00
David Blaikie	1ca61f6e1d	llvm-mca: Add missing includes Move the header include in the primary source file to the top to validate that it doesn't depend on any other inclusions. llvm-svn: 331897	2018-05-09 17:28:10 +00:00
Matt Davis	21a8d32307	[llvm-mca] Avoid exposing index values in the MCA interfaces. Summary: This patch eliminates many places where we originally needed to pass index values to represent an instruction. The index is still used as a key, in various parts of MCA. I'm not comfortable eliminating the index just yet. By burying the index in the instruction, we can avoid exposing that value in many places. Eventually, we should consider removing the Instructions list in the Backend all together, it's only used to hold and reclaim the memory for the allocated Instruction instances. Instead we could pass around a smart pointer. But that's a separate discussion/patch. Reviewers: andreadb, courbet, RKSimon Reviewed By: andreadb Subscribers: javed.absar, tschuett, gbedwell, llvm-commits Differential Revision: https://reviews.llvm.org/D46367 llvm-svn: 331660	2018-05-07 18:29:15 +00:00
Andrea Di Biagio	450ea7aed3	[llvm-mca] removes flag -instruction-tables from the "View Options" category. This patch also improves the description of a couple of flags in the view options. With this change, the -help now specifies which views are enabled by default. llvm-svn: 331594	2018-05-05 15:36:47 +00:00
Andrea Di Biagio	7bf825618c	[llvm-mca] minor tweak to the resource pressure printing functionality. NFC. llvm-svn: 331590	2018-05-05 12:21:54 +00:00
Matt Davis	35df8b24af	[llvm-mca] Add descriptive names for the TimelineView report characters. NFC. Summary: This change makes the TimelineView source simpler to read and easier to modify in the future. This patch introduces a class of static chars used as the display values in the TimelineView report, this change just eliminates a few magic characters. Reviewers: andreadb, courbet, RKSimon Reviewed By: andreadb Subscribers: tschuett, gbedwell, llvm-commits Differential Revision: https://reviews.llvm.org/D46409 llvm-svn: 331540	2018-05-04 17:19:40 +00:00
Andrea Di Biagio	24fb4fcb93	[llvm-mca] use colors for warnings and notes generated by InstrBuilder. llvm-svn: 331517	2018-05-04 13:52:12 +00:00
Andrea Di Biagio	49c8591397	[llvm-mca] remove unused argument from method InstrBuilder::createInstrDescImpl. We don't need to pass the instruction index to the method that constructs new instruction descriptors. No functional change intended. llvm-svn: 331516	2018-05-04 13:10:10 +00:00
Matt Davis	6aa5dcdcb2	[llvm-mca] Lift the logic of the RetireControlUnit from the Dispatch translation unit into its own translation unit. NFC The logic remains the same. Eventually, I see the RCU acting as its own separate stage in the instruction pipeline. Differential Revision: https://reviews.llvm.org/D46331 llvm-svn: 331316	2018-05-01 23:04:01 +00:00
Adrian Prantl	5f8f34e459	Remove \brief commands from doxygen comments. We've been running doxygen with the autobrief option for a couple of years now. This makes the \brief markers into our comments redundant. Since they are a visual distraction and we don't want to encourage more \brief markers in new code either, this patch removes them all. Patch produced by for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done Differential Revision: https://reviews.llvm.org/D46290 llvm-svn: 331272	2018-05-01 15:54:18 +00:00
Andrea Di Biagio	e047d3529b	[llvm-mca] Correctly handle zero-latency stores that consume pipeline resources. This fixes PR37293. We can have scheduling classes with no write latency entries, that still consume processor resources. We don't want to treat those instructions as zero-latency instructions; they still have to be issued to the underlying pipelines, so they still consume resource cycles. This is likely to be a regression which I have accidentally introduced at revision 330807. Now, if an instruction has a non-empty set of write processor resources, we conservatively treat it as a normal (i.e. non zero-latency) instruction. llvm-svn: 331193	2018-04-30 15:55:04 +00:00
Andrea Di Biagio	e9384eb13b	[llvm-mca] Support for in-order CPU for -instruction-tables testing. Added Intel Atom tests to verify that the tool correctly generates instruction tables even if the CPU is in-order. Fixes PR37282. llvm-svn: 331169	2018-04-30 12:05:34 +00:00
Matt Davis	ad78e6673c	[MCA] [NFC] Remove unused Index formal from ResourceManager::issueInstruction Summary: The instruction index was never referenced in the body. Just a minor cleanup. Reviewers: andreadb Reviewed By: andreadb Subscribers: javed.absar, gbedwell, llvm-commits Differential Revision: https://reviews.llvm.org/D46142 llvm-svn: 331001	2018-04-26 22:30:40 +00:00
Filipe Cabecinhas	def742ca52	[llvm-mca] Make ViewOptions static. NFCI llvm-svn: 330829	2018-04-25 14:39:16 +00:00
Andrea Di Biagio	534e1dab79	[llvm-mca] Add a new option category for views. With this patch, options to add/tweak views are all grouped together in the -help output. The new "View Options" category looks like this: ``` View Options: -dispatch-stats - Print dispatch statistics -instruction-info - Print the instruction info view -instruction-tables - Print instruction tables -register-file-stats - Print register file statistics -resource-pressure - Print the resource pressure view -retire-stats - Print retire control unit statistics -scheduler-stats - Print scheduler statistics -timeline - Print the timeline view -timeline-max-cycles=<uint> - Maximum number of cycles in the timeline view. Defaults to 80 cycles -timeline-max-iterations=<uint> - Maximum number of iterations to print in timeline view ``` llvm-svn: 330816	2018-04-25 11:33:14 +00:00
Andrea Di Biagio	641cca3ddf	[llvm-mca] run clang-format on a bunch of files. NFC llvm-svn: 330811	2018-04-25 10:27:30 +00:00
Andrea Di Biagio	93c49d5e58	[llvm-mca] Default to the native host cpu if flag -mcpu is not specified. llvm-svn: 330809	2018-04-25 10:18:25 +00:00
Andrea Di Biagio	db66efcb6a	[llvm-mca] Remove method Instruction::isZeroLatency(). NFCI llvm-svn: 330807	2018-04-25 09:38:58 +00:00
Andrea Di Biagio	ba625f0a86	[llvm-mca] Remove unused flag -verbose. NFC I forgot to remove it at r329794. llvm-svn: 330757	2018-04-24 19:14:56 +00:00
Andrea Di Biagio	0626864fa4	[llvm-mca] Default the output asm dialect used by the instruction printer to the input asm dialect. The instruction printer used by llvm-mca to generate the performance report now defaults the output assembly format to the format used for the input assembly file. On x86, the asm format can be either AT&T or Intel, depending on the presence/absence of directive `.intel_syntax`. Users can still specify a different assembly dialect with the command line flag -output-asm-variant=<uint>. llvm-svn: 330733	2018-04-24 16:19:08 +00:00
Andrea Di Biagio	27c4b09626	[llvm-mca] Refactor the Scheduler interface in preparation for PR36663. Zero latency instructions are now scheduled the same way as other instructions. Before this patch, there was a specialized code path for those instructions. All scheduler events are now generated from method `scheduleInstruction()` and from method `cycleEvent()`. This will make it easier to implement an "execution stage", and let that stage publish all the scheduler events. No functional change intended. llvm-svn: 330723	2018-04-24 14:53:16 +00:00
Jonas Devlieghere	6adef09891	[llvm-mca] Use WithColor for printing errors Use convenience helpers in WithColor to print errors and notes. Differential revision: https://reviews.llvm.org/D45666 llvm-svn: 330267	2018-04-18 15:26:51 +00:00
Rui Ueyama	197194b6c9	Define InitLLVM to do common initialization all at once. We have a few functions that virtually all command wants to run on process startup/shutdown. This patch adds InitLLVM class to do that all at once, so that we don't need to copy-n-paste boilerplate code to each llvm command's main() function. Differential Revision: https://reviews.llvm.org/D45602 llvm-svn: 330046	2018-04-13 18:26:06 +00:00
Andrea Di Biagio	c752616f30	[llvm-mca] Ensure that instructions with a schedule read-advance are always issued in the right order. Normally, the Scheduler prioritizes older instructions over younger instructions during the instruction issue stage. In one particular case where a dependent instruction had a schedule read-advance associated to one of the input operands, this rule was not correctly applied. This patch fixes the issue and adds a test to verify that we don't regress that particular case. llvm-svn: 330032	2018-04-13 15:19:07 +00:00

121 Commits