llvm-project

Commit Graph

Author	SHA1	Message	Date
Chad Rosier	f575a73751	Revert "MachineScheduler: better book-keeping for asserts." This reverts commit r212088, which is causing a number of spec failures. Will provide reduced test cases shortly. PR20057 llvm-svn: 212109	2014-07-01 17:23:11 +00:00
Andrew Trick	f1b307bcb0	MachineScheduler: better book-keeping for asserts. Fixes another test case under PR20057. llvm-svn: 212088	2014-07-01 03:23:13 +00:00
Andrew Trick	040c0da578	Left out the NDEBUG in the previous checkin. llvm-svn: 211867	2014-06-27 05:09:36 +00:00
Andrew Trick	5632722cab	MachineScheduler: add some book-keeping to fix an assert. Fix for Bug 20057 - Assertion failed in llvm::SUnit* llvm::SchedBoundary::pickOnlyChoice(): Assertion `i <= (HazardRec->getMaxLookAhead() + MaxObservedStall) && "permanent hazard"' Thanks to Chad for the test case. llvm-svn: 211865	2014-06-27 04:57:05 +00:00
Alp Toker	e69170a110	Revert "Introduce a string_ostream string builder facility" Temporarily back out commits r211749, r211752 and r211754. llvm-svn: 211814	2014-06-26 22:52:05 +00:00
Alp Toker	614717388c	Introduce a string_ostream string builder facility string_ostream is a safe and efficient string builder that combines opaque stack storage with a built-in ostream interface. small_string_ostream<bytes> additionally permits an explicit stack storage size other than the default 128 bytes to be provided. Beyond that, storage is transferred to the heap. This convenient class can be used in most places an std::string+raw_string_ostream pair or SmallString<>+raw_svector_ostream pair would previously have been used, in order to guarantee consistent access without byte truncation. The patch also converts much of LLVM to use the new facility. These changes include several probable bug fixes for truncated output, a programming error that's no longer possible with the new interface. llvm-svn: 211749	2014-06-26 00:00:48 +00:00
Andrew Trick	491e34a139	Fix the scheduler's MaxObservedStall computation. WenHan Gu pointed out this bug that results in an assert not being effective in some cases. llvm-svn: 210846	2014-06-12 22:36:28 +00:00
Andrew Trick	7f1ebbeb8f	Fix the MachineScheduler's logic for updating ready times for in-order. Now the scheduler updates a node's ready time as soon as it is scheduled, before releasing dependent nodes. There was a reason I didn't do this initially but it no longer applies. A53 is in-order and was running into an issue where nodes were added to the readyQ too early. That's now fixed. This also makes it easier for custom scheduling strategies to build heuristics based on the actual cycles that the node was scheduled at. The only impact on OOO (sandybridge/cyclone) is that ready times will be slightly more accurate. I didn't measure any significant regressions. llvm-svn: 210390	2014-06-07 01:48:43 +00:00
Andrew Trick	8d2ee37f31	Add a subtarget hook: enablePostMachineScheduler. As requested by AArch64 subtargets. Note that this will have no effect until the AArch64 target actually enables the pass like this: substitutePass(&PostRASchedulerID, &PostMachineSchedulerID); As soon as armv7 switches over, PostMachineScheduler will become the default postRA scheduler, so this won't be necessary any more. Targets using the old postRA scheduler would then do: substitutePass(&PostMachineSchedulerID, &PostRASchedulerID); llvm-svn: 210167	2014-06-04 07:06:27 +00:00
Andrew Trick	3ccf71d4d6	Move GenericScheduler and PostGenericScheduler into a header. These were not exposed previously because I didn't want out-of-tree targets to be too dependent on their internals. They can be reused for a very wide variety of processors with casual scheduling needs without exposing the classes by instead using hooks defined in MachineSchedPolicy (we can add more if needed). When targets are more aggressively tuned or want to provide custom heuristics, they can define their own MachineSchedStrategy. I tend to think this is better once you start customizing heuristics because you can copy over only what you need. I don't think that layering heuristics generally works well. However, AArch64 targets now want to reuse the Generic scheduling logic but also provide extensions. I don't see much harm in exposing the Generic scheduling classes with a major caveat: these scheduling strategies may change in the future without validating performance on less mainstream processors. If you want to be immune from changes, just define your own MachineSchedStrategy. llvm-svn: 210166	2014-06-04 07:06:18 +00:00
Craig Topper	9d74a5a5f1	[C++11] Add 'override' keywords and remove 'virtual'. Additionally add 'final' and leave 'virtual' on some methods that are marked virtual without overriding anything and have no obvious overrides themselves. llvm-svn: 207511	2014-04-29 07:58:41 +00:00
Chandler Carruth	1b9dde087e	[Modules] Remove potential ODR violations by sinking the DEBUG_TYPE define below all header includes in the lib/CodeGen/... tree. While the current modules implementation doesn't check for this kind of ODR violation yet, it is likely to grow support for it in the future. It also removes one layer of macro pollution across all the included headers. Other sub-trees will follow. llvm-svn: 206837	2014-04-22 02:02:50 +00:00
David Blaikie	422b93dcf1	Use unique_ptr to manage objects owned by the ScheduleDAGMI. llvm-svn: 206784	2014-04-21 20:32:32 +00:00
Craig Topper	c0196b1b40	[C++11] More 'nullptr' conversion. In some cases just using a boolean check instead of comparing to nullptr. llvm-svn: 206142	2014-04-14 00:51:57 +00:00
Paul Robinson	7c99ec5b99	Disable each MachineFunctionPass for 'optnone' functions, unless that pass normally runs at optimization level None, or is part of the register allocation pipeline. llvm-svn: 205228	2014-03-31 17:43:35 +00:00
Craig Topper	24e685fdb0	[C++11] Remove 'virtual' keyword from methods marked with 'override' keyword. llvm-svn: 203444	2014-03-10 05:29:18 +00:00
Benjamin Kramer	b0f74b24fa	[C++11] Convert sort predicates into lambdas. No functionality change. llvm-svn: 203288	2014-03-07 21:35:39 +00:00
Craig Topper	4584cd54e3	[C++11] Add 'override' keyword to virtual methods that override their base class. llvm-svn: 203220	2014-03-07 09:26:03 +00:00
Ahmed Charles	56440fd820	Replace OwningPtr<T> with std::unique_ptr<T>. This compiles with no changes to clang/lld/lldb with MSVC and includes overloads to various functions which are used by those projects and llvm which have OwningPtr's as parameters. This should allow out of tree projects some time to move. There are also no changes to libs/Target, which should help out of tree targets have time to move, if necessary. llvm-svn: 203083	2014-03-06 05:51:42 +00:00
Benjamin Kramer	b6d0bd48bd	[C++11] Replace llvm::next and llvm::prior with std::next and std::prev. Remove the old functions. llvm-svn: 202636	2014-03-02 12:27:27 +00:00
Craig Topper	73156025e0	Switch all uses of LLVM_OVERRIDE to just use 'override' directly. llvm-svn: 202621	2014-03-02 09:09:27 +00:00
Alp Toker	cb40291100	Fix known typos Sweep the codebase for common typos. Includes some changes to visible function names that were misspelt. llvm-svn: 200018	2014-01-24 17:20:08 +00:00
Andrew Trick	4675351afd	Reformat a loop for basic hygiene. Self review. llvm-svn: 199788	2014-01-22 03:38:55 +00:00
Andrew Trick	350ff2c084	Fix PR18572 - llc crash during GenericScheduler::initPolicy(). Generalized the heuristic that looks at the (very rough) size of the register file before enabling regpressure tracking. llvm-svn: 199766	2014-01-21 21:27:37 +00:00
Saleem Abdulrasool	7230b377df	CodeGen: silence a C++11 feature warning llvm-svn: 198133	2013-12-28 22:47:55 +00:00
Andrew Trick	7afe481801	Uninitialized variable (in never taken path) after factoring. llvm-svn: 198131	2013-12-28 22:25:57 +00:00
Andrew Trick	33e05d7665	Added debugging options: -misched-only-func/block llvm-svn: 198124	2013-12-28 21:57:02 +00:00
Andrew Trick	d14d7c20f5	Add a PostMachineScheduler pass with generic implementation. PostGenericScheduler uses either the new machine model or the hazard checker for top-down scheduling. Most of the infrastructure for PreRA machine scheduling is reused. With some tuning, this should allow MachineScheduler to be default for all ARM targets, including cortex-A9, using the new machine model. Likewise, with additional tuning, it should be able to replace PostRAScheduler for all targets. The PostMachineScheduler pass does not currently run the AntiDepBreaker. There is less need for it on targets that are already running preRA MachineScheduler. I want to prove it's necessary before committing to the maintenance burden. The PostMachineScheduler also currently removes kill flags and adds them all back later. This is a bit ridiculous. I'd prefer passes to directly use a liveness utility than rely on flags. A test case that enables this scheduler will be included in a subsequent checkin that updates the A9 model. llvm-svn: 198122	2013-12-28 21:56:57 +00:00
Andrew Trick	17080b9bf2	Stub out a PostMachineScheduler pass. Placeholder and boilerplate for a PostRA MachineScheduler pass. llvm-svn: 198120	2013-12-28 21:56:51 +00:00
Andrew Trick	d7f890edb0	Factor MI-Sched in preparation for post-ra scheduling support. Factor the MachineFunctionPass into MachineSchedulerBase. Split the DAG class into ScheduleDAGMI and ScheduleDAGMILive. llvm-svn: 198119	2013-12-28 21:56:47 +00:00
Andrew Trick	fc127d1197	Factor out the SchedRemainder/SchedBoundary from GenericScheduler strategy. These helper classes take care of the book-keeping that drives the GenericScheduler heuristics. It is likely that developers writing target-specific schedulers that work similarly to GenericScheduler will want to use these helpers too. The immediate goal is to develop a GenericPostScheduler that can run in place of the old PostRAScheduler, but will use the new machine model. No functionality change intended. llvm-svn: 196643	2013-12-07 05:59:44 +00:00
Andrew Trick	f7760a24e5	comment grammar llvm-svn: 196585	2013-12-06 17:19:20 +00:00
Daniel Jasper	0d92abdfd2	Fix bug introduced in r196517. Not only does it trigger -Wparentheses, I think the assert actually relies on incorrect operator precedence. Also, the grammar is questionable, but I might not know enough about the problem at hand. llvm-svn: 196567	2013-12-06 08:58:22 +00:00
Andrew Trick	5a22df498e	MI-Sched: Model "reserved" processor resources. This allows a target to use MI-Sched as an in-order scheduler that will model strict resource conflicts without defining a processor itinerary. Instead, the target can now use the new per-operand machine model and define in-order resources with BufferSize=0. For example, this would allow restricting the type of operations that can be formed into a dispatch group. (Normally NumMicroOps is sufficient to enforce dispatch groups). If the intent is to model latency in an in-order pipeline, as opposed to resource conflicts, then a resource with BufferSize=1 should be defined instead. This feature is only casually tested as there are no in-tree targets using it yet. However, Hal will be experimenting with POWER7. llvm-svn: 196517	2013-12-05 17:56:02 +00:00
Andrew Trick	880e573d98	MI-Sched: handle latency of in-order operations with the new machine model. The per-operand machine model allows the target to define "unbuffered" processor resources. This change is a quick, cheap way to model stalls caused by the latency of operations that use such resources. This only applies when the processor's micro-op buffer size is non-zero (Out-of-Order). We can't precisely model in-order stalls during out-of-order execution, but this is an easy and effective heuristic. It benefits cortex-a9 scheduling when using the new machine model, which is not yet on by default. MI-Sched for armv7 was evaluated on Swift (and only not enabled because of a performance bug related to predication). However, we never evaluated Cortex-A9 performance on MI-Sched in its current form. This change adds MI-Sched functionality to reach performance goals on A9. The only remaining change is to allow MI-Sched to run as a PostRA pass. I evaluated performance using a set of options to estimate the performance impact once MI sched is default on armv7: -mcpu=cortex-a9 -disable-post-ra -misched-bench -scheditins=false For a simple saxpy loop I see a 1.7x speedup. Here are the llvm-testsuite results: (min run time over 2 runs, filtering tiny changes)
Speedups:
| Benchmarks/BenchmarkGame/recursive | 52.39% |
| Benchmarks/VersaBench/beamformer | 20.80% |
| Benchmarks/Misc/pi | 19.97% |
| Benchmarks/Misc/mandel-2 | 19.95% |
| SPEC/CFP2000/188.ammp | 18.72% |
| Benchmarks/McCat/08-main/main | 18.58% |
| Benchmarks/Misc-C++/Large/sphereflake | 18.46% |
| Benchmarks/Olden/power | 17.11% |
| Benchmarks/Misc-C++/mandel-text | 16.47% |
| Benchmarks/Misc/oourafft | 15.94% |
| Benchmarks/Misc/flops-7 | 14.99% |
| Benchmarks/FreeBench/distray | 14.26% |
| SPEC/CFP2006/470.lbm | 14.00% |
| mediabench/mpeg2/mpeg2dec/mpeg2decode | 12.28% |
| Benchmarks/SmallPT/smallpt | 10.36% |
| Benchmarks/Misc-C++/Large/ray | 8.97% |
| Benchmarks/Misc/fp-convert | 8.75% |
| Benchmarks/Olden/perimeter | 7.10% |
| Benchmarks/Bullet/bullet | 7.03% |
| Benchmarks/Misc/mandel | 6.75% |
| Benchmarks/Olden/voronoi | 6.26% |
| Benchmarks/Misc/flops-8 | 5.77% |
| Benchmarks/Misc/matmul_f64_4x4 | 5.19% |
| Benchmarks/MiBench/security-rijndael | 5.15% |
| Benchmarks/Misc/flops-6 | 5.10% |
| Benchmarks/Olden/tsp | 4.46% |
| Benchmarks/MiBench/consumer-lame | 4.28% |
| Benchmarks/Misc/flops-5 | 4.27% |
| Benchmarks/mafft/pairlocalalign | 4.19% |
| Benchmarks/Misc/himenobmtxpa | 4.07% |
| Benchmarks/Misc/lowercase | 4.06% |
| SPEC/CFP2006/433.milc | 3.99% |
| Benchmarks/tramp3d-v4 | 3.79% |
| Benchmarks/FreeBench/pifft | 3.66% |
| Benchmarks/Ptrdist/ks | 3.21% |
| Benchmarks/Adobe-C++/loop_unroll | 3.12% |
| SPEC/CINT2000/175.vpr | 3.12% |
| Benchmarks/nbench | 2.98% |
| SPEC/CFP2000/183.equake | 2.91% |
| Benchmarks/Misc/perlin | 2.85% |
| Benchmarks/Misc/flops-1 | 2.82% |
| Benchmarks/Misc-C++-EH/spirit | 2.80% |
| Benchmarks/Misc/flops-2 | 2.77% |
| Benchmarks/NPB-serial/is | 2.42% |
| Benchmarks/ASC_Sequoia/CrystalMk | 2.33% |
| Benchmarks/BenchmarkGame/n-body | 2.28% |
| Benchmarks/SciMark2-C/scimark2 | 2.27% |
| Benchmarks/Olden/bh | 2.03% |
| skidmarks10/skidmarks | 1.81% |
| Benchmarks/Misc/flops | 1.72% |
Slowdowns:
| Benchmarks/llubenchmark/llu | -14.14% |
| Benchmarks/Polybench/stencils/seidel-2d | -5.67% |
| Benchmarks/Adobe-C++/functionobjects | -5.25% |
| Benchmarks/Misc-C++/oopack_v1p8 | -5.00% |
| Benchmarks/Shootout/hash | -2.35% |
| Benchmarks/Prolangs-C++/ocean | -2.01% |
| Benchmarks/Polybench/medley/floyd-warshall | -1.98% |
| Polybench/linear-algebra/kernels/3mm | -1.95% |
| Benchmarks/McCat/09-vor/vor | -1.68% |
llvm-svn: 196516	2013-12-05 17:55:58 +00:00
Andrew Trick	bb1247b9f0	comment typo and reformat llvm-svn: 196513	2013-12-05 17:55:47 +00:00
Juergen Ributzka	d12ccbd343	[weak vtables] Remove a bunch of weak vtables This patch removes most of the trivial cases of weak vtables by pinning them to a single object file. The memory leaks in this version have been fixed. Thanks Alexey for pointing them out. Differential Revision: http://llvm-reviews.chandlerc.com/D2068 Reviewed by Andy llvm-svn: 195064	2013-11-19 00:57:56 +00:00
Alexey Samsonov	49109a279c	Revert r194865 and r194874. This change is incorrect. If you delete virtual destructor of both a base class and a subclass, then the following code: Base *foo = new Child(); delete foo; will not cause the destructor for members of Child class to run. As a result, I observe plenty of memory leaks. Notable examples I investigated are: ObjectBuffer and ObjectBufferStream, AttributeImpl and StringSAttributeImpl. llvm-svn: 194997	2013-11-18 09:31:53 +00:00
Juergen Ributzka	dbedae89b9	[weak vtables] Remove a bunch of weak vtables This patch removes most of the trivial cases of weak vtables by pinning them to a single object file. Differential Revision: http://llvm-reviews.chandlerc.com/D2068 Reviewed by Andy llvm-svn: 194865	2013-11-15 22:34:48 +00:00
Matthias Braun	88dd0abd2d	Pass LiveQueryResult by value This makes the API a bit more natural to use and makes it easier to make LiveRanges implementation details private. llvm-svn: 192394	2013-10-10 21:28:52 +00:00
Andrew Trick	dc4c1adfc7	Comment typo. llvm-svn: 191312	2013-09-24 17:11:19 +00:00
Andrew Trick	978674b2bc	Allow subtarget selection of the default MachineScheduler and document the interface. The global registry is used to allow command line override of the scheduler selection, but does not work well as the normal selection API. For example, the same LLVM process should be able to target multiple targets or subtargets. llvm-svn: 191071	2013-09-20 05:14:41 +00:00
Andrew Trick	665d3ec3d3	Rename ConvergingScheduler to GenericScheduler. This was an experimental scheduler a year ago. It's now used by several subtargets, both in-order and out-of-order, and it is about to be enabled by default for x86 and armv7. It will be the new GenericScheduler for subtargets that don't provide their own SchedulingStrategy. llvm-svn: 191051	2013-09-19 23:10:59 +00:00
Andrew Trick	6c88b35090	Enable -misched-cyclicpath by default. llvm-svn: 190367	2013-09-09 23:31:14 +00:00
Andrew Trick	e1f7bf2c02	mi-sched: smooth out the cyclicpath heuristic. Arnold's idea. I generally try to avoid stateful heuristics because it can make debugging harder. However, we need a way to prevent the latency priority from dominating, and it somewhat makes sense to schedule aggressively for latency only within an issue group. Swift in particular likes this, and it doesn't hurt anyone else:
| Benchmarks/MiBench/consumer-lame | 10.39% |
| Benchmarks/Misc/himenobmtxpa | 9.63% |
llvm-svn: 190360	2013-09-09 22:28:08 +00:00
Andrew Trick	b248b4a1de	mi-sched: cleanup register pressure update, remove a FIXME. llvm-svn: 190181	2013-09-06 17:32:47 +00:00
Andrew Trick	c573cd905a	mi-sched: improve regpressure tracing. llvm-svn: 190180	2013-09-06 17:32:44 +00:00
Andrew Trick	7609b7d1b5	mi-sched: print tree size in -view-misched-dags llvm-svn: 190179	2013-09-06 17:32:42 +00:00
Andrew Trick	ffdbefb90c	mi-sched: register pressure update tracing. llvm-svn: 190178	2013-09-06 17:32:39 +00:00
Andrew Trick	ddffae9027	mi-sched: Reorder Cyclicpath (latency) and CriticalMax (pressure) heuristics. The latency based scheduling could induce spills in some cases. llvm-svn: 190177	2013-09-06 17:32:36 +00:00
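
Note on r194997 (49109a279c): the revert rests on a general C++ rule. Deleting a derived object through a base-class pointer is only well defined when the base class has a virtual destructor. The following is a minimal sketch of that rule, not code from LLVM. Base and Child echo the names used in the commit message; Member, Log, and destroyThroughBase are hypothetical names introduced purely for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical illustration only; none of these types are LLVM classes.
// Log records the order in which destructors run.
static std::vector<std::string> Log;

struct Member {
  ~Member() { Log.push_back("~Member"); }
};

struct Base {
  // The virtual destructor makes `delete` through a Base* dispatch to the
  // most-derived destructor. Without it, the delete in destroyThroughBase()
  // would be undefined behavior.
  virtual ~Base() { Log.push_back("~Base"); }
};

struct Child : Base {
  Member M; // Destroyed only if Child's destructor actually runs.
  ~Child() override { Log.push_back("~Child"); }
};

// Destroys a Child through a Base pointer and returns the destructor order.
std::vector<std::string> destroyThroughBase() {
  Log.clear();
  Base *Foo = new Child();
  delete Foo;
  assert(!Log.empty() && "destructors must have run");
  return Log;
}
```

Calling destroyThroughBase() returns the sequence ~Child, ~Member, ~Base. If the virtual destructor were removed from Base, the delete would be undefined behavior, and Member's destructor would typically not run, which is consistent with the leaks described in the revert message.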

216 Commits