Over the entire test-suite, this has an insignificantly negative average
performance impact, but reduces some of the worst slowdowns from the
anti-dep. change (r158294).
Largest speedups:
SingleSource/Benchmarks/Stanford/Quicksort - 28%
SingleSource/Benchmarks/Stanford/Towers - 24%
SingleSource/Benchmarks/Shootout-C++/matrix - 23%
MultiSource/Benchmarks/SciMark2-C/scimark2 - 19%
MultiSource/Benchmarks/MiBench/automotive-bitcount/automotive-bitcount - 15%
(matrix and automotive-bitcount were both in the top-5 slowdown list from the
anti-dep. change)
Largest slowdowns:
MultiSource/Benchmarks/McCat/03-testtrie/testtrie - 28%
MultiSource/Benchmarks/mediabench/gsm/toast/toast - 26%
MultiSource/Benchmarks/MiBench/automotive-susan/automotive-susan - 21%
SingleSource/Benchmarks/CoyoteBench/lpbench - 20%
MultiSource/Applications/d/make_dparser - 16%
llvm-svn: 158296
Using 'all' instead of 'critical' would be better because it would make it easier to
satisfy the bundling constraints, but, as noted in the FIXME, that is currently not
possible with the crs.
This yields an average 1% speedup over the entire test suite (on Power 7). Largest speedups:
SingleSource/Benchmarks/Shootout-C++/moments - 40%
MultiSource/Benchmarks/McCat/03-testtrie/testtrie - 28%
SingleSource/Benchmarks/BenchmarkGame/nsieve-bits - 26%
SingleSource/Benchmarks/McGill/misr - 23%
MultiSource/Applications/JM/ldecod/ldecod - 22%
Largest slowdowns:
SingleSource/Benchmarks/Shootout-C++/matrix - -29%
SingleSource/Benchmarks/Shootout-C++/ary3 - -22%
MultiSource/Benchmarks/BitBench/uuencode/uuencode - -18%
SingleSource/Benchmarks/Shootout-C++/ary - -17%
MultiSource/Benchmarks/MiBench/automotive-bitcount/automotive-bitcount - -15%
llvm-svn: 158294
We need an efficient mechanism to determine whether a defaulted default
constructor is constexpr, in order to determine whether a class is a literal
type, so keep the incrementally-built form on CXXRecordDecl. Remove the
on-demand computation of same, so that we only have one method for determining
whether a default constructor is constexpr. This doesn't affect correctness,
since default constructor lookup is much simpler than selecting a constructor
for copying or moving.
We don't need a corresponding mechanism for defaulted copy or move constructors,
since they can't affect whether a type is a literal type. Conversely, checking
whether such functions are constexpr can require non-trivial effort, so we defer
such checks until the copy or move constructor is required.
Thus we now only compute whether a copy or move constructor is constexpr on
demand, and only compute whether a default constructor is constexpr in advance.
This is unfortunate, but seems like the best solution.
llvm-svn: 158290
an explicitly-defaulted default constructor would be constexpr. This is
necessary in weird (but well-formed) cases where a class has more than one copy
or move constructor.
Cleanup of now-unused parts of CXXRecordDecl to follow.
llvm-svn: 158289
The PPC64 backend had patterns for i32 <-> i64 extensions and truncations that
would leave self-moves in the final assembly. Replacing those patterns with ones
based on the SUBREG builtins yields better-looking code.
Thanks to Jakob and Owen for their suggestions in this matter.
llvm-svn: 158283
Tail merging had been disabled on PPC because it would disturb bundling decisions
made during pre-RA scheduling on the 970 cores. Now, however, all bundling decisions
are made during post-RA scheduling, and tail merging is generally beneficial (the
average test-suite speedup is insignificantly positive).
Largest test-suite speedups:
MultiSource/Benchmarks/mediabench/gsm/toast/toast - 30%
MultiSource/Benchmarks/BitBench/uuencode/uuencode - 23%
SingleSource/Benchmarks/Shootout-C++/ary - 21%
SingleSource/Benchmarks/Stanford/Queens - 17%
Largest slowdowns:
MultiSource/Benchmarks/MiBench/security-sha/security-sha - 24%
MultiSource/Benchmarks/McCat/03-testtrie/testtrie - 22%
MultiSource/Applications/JM/ldecod/ldecod - 14%
MultiSource/Benchmarks/mediabench/g721/g721encode/encode - 9%
This is improved by using full (instead of just critical) anti-dependency breaking,
but doing so still causes miscompiles and so cannot yet be enabled by default.
llvm-svn: 158259
in the same line do not override getting a cursor for the previous declaration.
e.g:
int x, y;
@synthesize prop1, prop2;
pointing at 'x'/'prop1' would give 'y'/'prop2' because their source ranges overlap.
rdar://11361113
llvm-svn: 158258
The LiveRegMatrix represents the live range of assigned virtual
registers in a Live interval union per register unit. This is not
fundamentally different from the interference tracking in RegAllocBase
that both RABasic and RAGreedy use.
The important differences are:
- LiveRegMatrix tracks interference per register unit instead of per
physical register. This makes interference checks cheaper and
assignments slightly more expensive. For example, the ARM D7 reigster
has 24 aliases, so we would check 24 physregs before assigning to one.
With unit-based interference, we check 2 units before assigning to 2
units.
- LiveRegMatrix caches regmask interference checks. That is currently
duplicated functionality in RABasic and RAGreedy.
- LiveRegMatrix is a pass which makes it possible to insert
target-dependent passes between register allocation and rewriting.
Such passes could tweak the register assignments with interference
checking support from LiveRegMatrix.
Eventually, RABasic and RAGreedy will be switched to LiveRegMatrix.
llvm-svn: 158255
This deduplicates some code from the optimizing register allocators, and
it means that it is now possible to change the register allocators'
solutions simply by editing the VirtRegMap between the register
allocator pass and the rewriter.
llvm-svn: 158249
While this code is valid C++98, it is not valid C++11. The problem can be
reduced to:
class MDNode;
class DIType {
operator MDNode*() const {return 0;}
};
class WeakVH {
WeakVH(MDNode*) {}
};
int main() {
DIType di;
std::pair<void*, WeakVH> p(std::make_pair((void*)0, di)));
}
This was not detected by any of the bots we have because they either compile
C++98 with libstdc++ (which allows it), or C++11 with libc++ (which incorrectly
allows it). I ran into the problem when compiling with VS 2012 RC.
Thanks to Richard for explaining the issue.
llvm-svn: 158245
OK, not really. We don't want to reintroduce the old rewriter hacks.
This patch extracts virtual register rewriting as a separate pass that
runs after the register allocator. This is possible now that
CodeGen/Passes.cpp can configure the full optimizing register allocator
pipeline.
The rewriter pass uses register assignments in VirtRegMap to rewrite
virtual registers to physical registers, and it inserts kill flags based
on live intervals.
These finalization steps are the same for the optimizing register
allocators: RABasic, RAGreedy, and PBQP.
llvm-svn: 158244
The fast register allocator is not supposed to work in the optimizing
pipeline. It doesn't make sense to compute live intervals, run full copy
coalescing, and then run RAFast.
Fast register allocation in the optimizing pipeline is better done by
RABasic.
llvm-svn: 158242
than being given the pthread_mutex_t from the Mutex and locks that. That allows us to
track ownership of the Mutex better.
Used this to switch the LLDB_CONFIGURATION_DEBUG enabled assert when we can't get the
gdb-remote sequence mutex to assert when the thread that had the mutex releases it. This
is generally more useful information than saying just who failed to get it (since the
code that had it locked often had released it by the time the assert fired.)
llvm-svn: 158240
This could happen for cases like this:
- (NSArray *)getAllNames:(NSArray *)images {
NSMutableArray *results = [NSMutableArray array];
for (auto img in images) {
[results addObject:img.name];
}
return results;
}
Here the property access will fail because 'img' has type 'id', rather than,
say, NSImage.
This warning will not fire in templated code, since the 'id' could have
come from a template parameter.
llvm-svn: 158239
-%a + 42
into
42 - %a
previously we were emitting:
-(%a + 42)
This fixes the infinite loop in PR12338. The generated code is still not perfect, though.
Will work on that next
llvm-svn: 158237