Tail merging had been disabled on PPC because it would disturb bundling decisions
made during pre-RA scheduling on the 970 cores. Now, however, all bundling decisions
are made during post-RA scheduling, and tail merging is generally beneficial (the
average test-suite speedup is insignificantly positive).
Largest test-suite speedups:
MultiSource/Benchmarks/mediabench/gsm/toast/toast - 30%
MultiSource/Benchmarks/BitBench/uuencode/uuencode - 23%
SingleSource/Benchmarks/Shootout-C++/ary - 21%
SingleSource/Benchmarks/Stanford/Queens - 17%
Largest slowdowns:
MultiSource/Benchmarks/MiBench/security-sha/security-sha - 24%
MultiSource/Benchmarks/McCat/03-testtrie/testtrie - 22%
MultiSource/Applications/JM/ldecod/ldecod - 14%
MultiSource/Benchmarks/mediabench/g721/g721encode/encode - 9%
This is improved by using full (instead of just critical) anti-dependency breaking,
but doing so still causes miscompiles and so cannot yet be enabled by default.
llvm-svn: 158259
in the same line do not override getting a cursor for the previous declaration.
e.g.:
int x, y;
@synthesize prop1, prop2;
pointing at 'x'/'prop1' would give 'y'/'prop2' because their source ranges overlap.
rdar://11361113
llvm-svn: 158258
The LiveRegMatrix represents the live ranges of assigned virtual
registers in a LiveIntervalUnion per register unit. This is not
fundamentally different from the interference tracking in RegAllocBase
that both RABasic and RAGreedy use.
The important differences are:
- LiveRegMatrix tracks interference per register unit instead of per
physical register. This makes interference checks cheaper and
assignments slightly more expensive. For example, the ARM D7 register
has 24 aliases, so we would check 24 physregs before assigning to one.
With unit-based interference, we check 2 units before assigning to 2
units.
- LiveRegMatrix caches regmask interference checks. That is currently
duplicated functionality in RABasic and RAGreedy.
- LiveRegMatrix is a pass which makes it possible to insert
target-dependent passes between register allocation and rewriting.
Such passes could tweak the register assignments with interference
checking support from LiveRegMatrix.
Eventually, RABasic and RAGreedy will be switched to LiveRegMatrix.
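To illustrate the per-unit idea, here is a minimal, hypothetical sketch in
plain C++ (the names UnitMatrix, checkInterference, and assign are
illustrative, not the actual LiveRegMatrix/LiveIntervalUnion interfaces):
each candidate physical register is described by the small set of register
units it covers, and interference is checked only against those units.

#include <cstdio>
#include <map>
#include <utility>
#include <vector>

using Segment = std::pair<unsigned, unsigned>;  // [start, end) slot indices

struct UnitMatrix {
  // One interval union per register unit, keyed by unit id.
  std::map<unsigned, std::vector<Segment>> Units;

  static bool overlaps(const Segment &A, const Segment &B) {
    return A.first < B.second && B.first < A.second;
  }

  // Check interference only against the units the candidate register covers.
  bool checkInterference(const std::vector<Segment> &VirtRange,
                         const std::vector<unsigned> &RegUnits) const {
    for (unsigned U : RegUnits) {
      auto It = Units.find(U);
      if (It == Units.end())
        continue;
      for (const Segment &S : It->second)
        for (const Segment &V : VirtRange)
          if (overlaps(S, V))
            return true;
    }
    return false;
  }

  // Assign: record the virtual register's live range in each covered unit.
  void assign(const std::vector<Segment> &VirtRange,
              const std::vector<unsigned> &RegUnits) {
    for (unsigned U : RegUnits)
      Units[U].insert(Units[U].end(), VirtRange.begin(), VirtRange.end());
  }
};

int main() {
  UnitMatrix M;
  std::vector<unsigned> D7Units = {14, 15};      // e.g. two 32-bit units
  std::vector<Segment> VR0 = {{0, 10}}, VR1 = {{5, 20}};

  M.assign(VR0, D7Units);                        // two unit updates, not 24
  std::printf("VR1 interferes: %d\n", M.checkInterference(VR1, D7Units));
  return 0;
}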
llvm-svn: 158255
This deduplicates some code from the optimizing register allocators, and
it means that it is now possible to change the register allocators'
solutions simply by editing the VirtRegMap between the register
allocator pass and the rewriter.
llvm-svn: 158249
While this code is valid C++98, it is not valid C++11. The problem can be
reduced to:
#include <utility>

class MDNode;

class DIType {
public:
  operator MDNode*() const { return 0; }
};

class WeakVH {
public:
  WeakVH(MDNode*) {}
};

int main() {
  DIType di;
  std::pair<void*, WeakVH> p(std::make_pair((void*)0, di));
}
This was not detected by any of the bots we have because they either compile
C++98 with libstdc++ (which allows it), or C++11 with libc++ (which incorrectly
allows it). I ran into the problem when compiling with VS 2012 RC.
Thanks to Richard for explaining the issue.
llvm-svn: 158245
OK, not really. We don't want to reintroduce the old rewriter hacks.
This patch extracts virtual register rewriting as a separate pass that
runs after the register allocator. This is possible now that
CodeGen/Passes.cpp can configure the full optimizing register allocator
pipeline.
The rewriter pass uses register assignments in VirtRegMap to rewrite
virtual registers to physical registers, and it inserts kill flags based
on live intervals.
These finalization steps are the same for the optimizing register
allocators: RABasic, RAGreedy, and PBQP.
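As a rough conceptual model (hypothetical types and names, not the real
VirtRegMap/LiveIntervals interfaces), the rewriting step amounts to
replacing each virtual-register operand with its assigned physical register
and marking the last use of each value with a kill flag:

#include <cstdio>
#include <map>
#include <vector>

struct Operand {
  unsigned Reg;   // virtual before rewriting, physical after
  bool IsKill;    // set on the last use of the value
};

struct Instr {
  unsigned Index;               // instruction number within the function
  std::vector<Operand> Uses;
};

int main() {
  // Assignments chosen by the register allocator (vreg -> physreg).
  std::map<unsigned, unsigned> VirtRegMap = {{1001, 3}, {1002, 7}};
  // Last-use points derived from the live intervals.
  std::map<unsigned, unsigned> LastUse = {{1001, 2}, {1002, 1}};

  std::vector<Instr> Body = {{0, {{1001, false}}},
                             {1, {{1002, false}}},
                             {2, {{1001, false}}}};

  for (Instr &I : Body)
    for (Operand &Op : I.Uses) {
      unsigned Virt = Op.Reg;
      Op.Reg = VirtRegMap.at(Virt);              // virtual -> physical
      Op.IsKill = (LastUse.at(Virt) == I.Index); // kill flag from live ranges
    }

  for (const Instr &I : Body)
    std::printf("instr %u uses R%u%s\n", I.Index, I.Uses[0].Reg,
                I.Uses[0].IsKill ? " (kill)" : "");
  return 0;
}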
llvm-svn: 158244
The fast register allocator is not supposed to work in the optimizing
pipeline. It doesn't make sense to compute live intervals, run full copy
coalescing, and then run RAFast.
Fast register allocation in the optimizing pipeline is better done by
RABasic.
llvm-svn: 158242
than being given the pthread_mutex_t from the Mutex and locks that. That allows us to
track ownership of the Mutex better.
Used this to change the LLDB_CONFIGURATION_DEBUG assert that fires when we can't get the
gdb-remote sequence mutex: it now asserts in the thread that held the mutex when it
releases it. This is generally more useful information than just reporting who failed to
acquire the mutex (since the code that had it locked has often already released it by the
time the assert fires.)
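A minimal sketch of the ownership-tracking idea (hypothetical names, not the
LLDB Mutex/Mutex::Locker API): the locker takes the wrapping mutex object
instead of a raw pthread_mutex_t, so the mutex can record the owning thread
and run debug checks from the owning side when the lock is released.

#include <cassert>
#include <pthread.h>

class TrackedMutex {
public:
  TrackedMutex() { pthread_mutex_init(&m_mutex, nullptr); }
  ~TrackedMutex() { pthread_mutex_destroy(&m_mutex); }

  void Lock() {
    pthread_mutex_lock(&m_mutex);
    m_owner = pthread_self();          // remember who holds the mutex
  }

  void Unlock() {
    // The owning thread is known here, so a debug check can fire in the
    // thread that actually held the lock rather than in a waiter.
    assert(pthread_equal(m_owner, pthread_self()) && "unlock from non-owner");
    m_owner = pthread_t();
    pthread_mutex_unlock(&m_mutex);
  }

  class Locker {                       // RAII helper taking the mutex object
  public:
    explicit Locker(TrackedMutex &m) : m_mutex(m) { m_mutex.Lock(); }
    ~Locker() { m_mutex.Unlock(); }
  private:
    TrackedMutex &m_mutex;
  };

private:
  pthread_mutex_t m_mutex;
  pthread_t m_owner{};
};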
llvm-svn: 158240
This could happen for cases like this:
- (NSArray *)getAllNames:(NSArray *)images {
  NSMutableArray *results = [NSMutableArray array];
  for (auto img in images) {
    [results addObject:img.name];
  }
  return results;
}
Here the property access will fail because 'img' has type 'id', rather than,
say, NSImage.
This warning will not fire in templated code, since the 'id' could have
come from a template parameter.
llvm-svn: 158239
-%a + 42
into
42 - %a
Previously we were emitting:
-(%a + 42)
This fixes the infinite loop in PR12338. The generated code is still not perfect, though.
Will work on that next.
llvm-svn: 158237
- On iOS, we select the "apcs-gnu" ABI to match
what libraries expect.
- Literals are now allocated at their preferred
alignment, eliminating many alignment crashes.
llvm-svn: 158236
Execute which was never going to get run and another ExecuteRawCommandString. Took the knowledge of how
to prepare raw & parsed commands out of CommandInterpreter and put it in CommandObject where it belongs.
Also moved the subcommands of multiword commands that had been declared in the .h file for
the overall command into the .cpp file.
Made the CommandObject flags work for raw as well as parsed commands.
Made "expr" use the flags so that it requires you to be paused to run "expr".
llvm-svn: 158235
Objective-C literals conceptually always create new objects, but may be
optimized by the compiler or runtime (constant folding, singletons, etc).
Comparing addresses of these objects relies on this optimization
behavior, which is really an implementation detail.
In the case of == and !=, offer a fixit to a call to -isEqual:, if the
method is available. This fixit is directly on the error so that it is
automatically applied.
Most of the time, this is really a newbie mistake, hence the fixit.
llvm-svn: 158230
This occurs when you have two insertions and the first one is so long that the
second fixit's column is before the first fixit ends. The edits themselves
don't actually overlap, but our command-line preview does.
llvm-svn: 158229
constexpr until we get to the end of the class definition. When that happens,
be sure to remember that the class actually does have a constexpr constructor.
This is a stopgap solution, which still doesn't cover the case of a class with
multiple copy constructors (only some of which are constexpr). We should be
performing constructor lookup when implicitly defining a constructor in order
to determine whether all constructors it invokes are constexpr.
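For reference, a minimal illustration (not the case this change was reduced
from) of an implicitly declared copy constructor whose constexpr-ness the
compiler must remember once the class is complete:

struct Point {
  int x, y;
  constexpr Point(int x, int y) : x(x), y(y) {}
  // No copy constructor is written; the implicit one is constexpr because
  // every member is a literal type.
};

constexpr Point origin(0, 0);
constexpr Point copy = origin;  // only valid if the implicit copy ctor is constexpr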
llvm-svn: 158228
problem was that by moving instructions around inside the function, the pass
could accidentally move the iterator being used to advance over the function
too. Fix this by only processing the instruction equal to the iterator, and
leaving processing of instructions that might not be equal to the iterator
to later (later = after traversing the basic block; it could also wait until
after traversing the entire function, but this might make the sets quite big).
Original commit message:
Grab-bag of reassociate tweaks. Unify handling of dead instructions and
instructions to reoptimize. Exploit this to more systematically eliminate
dead instructions (this isn't very useful in practice but is convenient for
analysing some testcase I am working on). No need for WeakVH any more: use
an AssertingVH instead.
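The safe-iteration pattern described above, as a generic sketch (standard
containers standing in for basic blocks and instructions, not the
reassociate code itself): mutate only the element the iterator currently
points at, and queue any other element that needs work for a pass that runs
after the traversal.

#include <cstdio>
#include <list>
#include <set>

using Inst = int;  // stand-in for an instruction

int main() {
  std::list<Inst> Block = {1, 2, 3, 4, 5};
  std::set<Inst> Redo;               // work found for *other* instructions

  for (auto It = Block.begin(); It != Block.end(); ++It) {
    if (*It % 2 == 0)
      *It *= 10;                     // safe: we only touch the current element
    if (*It == 3)
      Redo.insert(5);                // don't touch it now; queue it instead
  }

  // After traversal, process the queued instructions without a live iterator.
  for (Inst I : Redo)
    Block.remove(I);

  for (Inst I : Block)
    std::printf("%d ", I);
  std::printf("\n");
  return 0;
}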
llvm-svn: 158226
Thanks to Jakob's help, this now causes no new test suite failures!
Over the entire test suite, this gives an average 1% speedup. The largest speedups are:
SingleSource/Benchmarks/Misc/pi - 108%
SingleSource/Benchmarks/CoyoteBench/lpbench - 54%
MultiSource/Benchmarks/Prolangs-C/unix-smail/unix-smail - 50%
SingleSource/Benchmarks/Shootout/ary3 - 32%
SingleSource/Benchmarks/Shootout-C++/matrix - 30%
The largest slowdowns are:
MultiSource/Benchmarks/mediabench/gsm/toast/toast - -30%
MultiSource/Benchmarks/Prolangs-C/bison/mybison - -25%
MultiSource/Benchmarks/BitBench/uuencode/uuencode - -22%
MultiSource/Applications/d/make_dparser - -14%
SingleSource/Benchmarks/Shootout-C++/ary - -13%
In light of these slowdowns, additional profiling work is obviously needed!
llvm-svn: 158223
Marking these classes as non-allocatable allows CTR loop generation to
work correctly with the block placement passes, etc. These register
classes are currently used only by some unused TCRETURN patterns.
In future cleanup, these will be removed.
Thanks again to Jakob for suggesting this fix to the CTR loop problem!
llvm-svn: 158221
to addition.
We should not warn when the malloc size argument is an addition
containing a 'sizeof' operator - it is common to use this pattern
to pack values of different sizes into a buffer.
Ex:
uint8_t *buffer = (uint8_t*)malloc(dataSize + sizeof(length));
llvm-svn: 158219
The preprocessor's handling of diagnostic push/pops is stateful, so
encountering pragmas during a re-parse causes problems. HTMLRewrite
already filters out normal # directives including #pragma, so it's
clear it's not expected to be interpreting pragmas in this mode.
This fix adds a flag to Preprocessor to explicitly disable pragmas.
The "right" fix might be to separate pragma lexing from pragma
parsing so that we can throw away pragmas like we do preprocessor
directives, but right now it's important to get the fix in.
Note that this has nothing to do with the "hack" of re-using the
input preprocessor in HTMLRewrite. Even if we someday copy the
preprocessor instead of re-using it, the copy would (and should) include
the diagnostic level tables and have the same problems.
llvm-svn: 158214
Bulk move of TargetInstrInfo implementation into
TargetInstrInfoImpl. This is dirty because the code isn't part of the
TargetInstrInfoImpl class, nor should it be, because the methods are
not target hooks. However, it's the current mechanism for keeping
libTarget useful outside the backend. You'll get a not-so-nice link
error if you invoke a TargetInstrInfo method that depends on CodeGen.
The TargetInstrInfoImpl class should probably be removed since it
doesn't really solve this problem.
To really fix this, we probably need separate interfaces for the
CodeGen/nonCodeGen sides of TargetInstrInfo.
llvm-svn: 158212