2008-11-12 20:31:28 +08:00
IRgen optimization opportunities.
//===---------------------------------------------------------------------===//
The common pattern of
--
short x; // or char, etc
(x == 10)
--
generates a zext/sext of x which can easily be avoided.
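As a rough illustration (a sketch, not IRgen's actual logic): the integer promotions widen x to int before the compare, which is where the zext/sext comes from. When the constant is representable in x's type, a compare at the narrow width gives the same answer. The helper names below are hypothetical, and the "narrow" form is only a C stand-in for what would be an i16 compare in the IR.

```c
#include <limits.h>

/* What the source expresses: x is widened to int, then compared. */
static int eq10_promoted(short x) {
    return (int)x == 10;
}

/* Stand-in for the cheaper form: 10 fits in a short, so the compare
   could be done at the narrow width with no extension of x. */
static int eq10_narrow(short x) {
    const short k = 10;
    return x == k;
}

/* Exhaustively check that both forms agree for every short value. */
static int forms_agree(void) {
    for (int v = SHRT_MIN; v <= SHRT_MAX; ++v)
        if (eq10_promoted((short)v) != eq10_narrow((short)v))
            return 0;
    return 1;
}
```

The exhaustive loop is just a cheap way to see that narrowing the constant is safe here; it would not be safe for a constant that does not fit in the narrow type.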
//===---------------------------------------------------------------------===//
Bitfield accesses can be shifted to simplify masking and sign
extension. For example, if the bitfield width is 8 and it is
appropriately aligned then it is a lot shorter to just load the char
directly.
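A minimal sketch of the 8-bit case, assuming a hypothetical signed field that occupies byte 1 of a 4-byte storage unit laid out little-endian. The function names and the layout are illustrative only, and the sketch assumes a two's complement signed char.

```c
#include <stdint.h>

/* Assemble the 32-bit storage unit from bytes (little-endian order),
   so the example does not depend on the host's endianness. */
static uint32_t load_unit_le(const unsigned char *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Generic path: load the whole unit, shift, mask, then sign extend
   from 8 bits. */
static int32_t field_via_shift_mask(const unsigned char *p) {
    uint32_t field = (load_unit_le(p) >> 8) & 0xFFu;
    return (int32_t)(field ^ 0x80u) - 0x80;
}

/* Shortcut: the field is exactly one aligned byte, so load that byte
   as a signed char and let the ordinary conversion sign extend it. */
static int32_t field_via_byte_load(const unsigned char *p) {
    return *(const signed char *)(p + 1);
}
```

Both functions return the same value for any input; the second simply needs one narrow load instead of a wide load plus shift, mask, and sign extension.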
//===---------------------------------------------------------------------===//
It may be worth avoiding creation of allocas for formal arguments
for the common situation where the argument is never written to and
never has its address taken. The idea would be to begin generating code
by using the argument directly and, if its address is taken or it is
stored to, then generate the alloca and patch up the existing code.

In theory, the same optimization could be a win for block-local
variables as long as the declaration dominates all statements in the
block.
NOTE: The main case we care about this for is -O0 -g compile time
performance, and in that scenario we currently need to emit the alloca
anyway to emit proper debug info. So this is blocked on
being able to emit debug information which refers to an LLVM
temporary, not an alloca.
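To make the three situations concrete, here is a small sketch in C. The function names are hypothetical; the comments describe what IRgen could do under the idea above, not what it does today.

```c
/* 'n' is only read: the incoming argument value could be used
   directly, with no alloca/store/load round trip. */
static int twice(int n) {
    return n + n;
}

/* 'n' is stored to: this is where the alloca would have to be
   created and the uses generated so far patched up. */
static int clamp_to_zero(int n) {
    if (n < 0)
        n = 0;
    return n;
}

/* Helper that writes through a pointer. */
static void bump(int *p) {
    *p += 1;
}

/* 'n' has its address taken, which likewise needs a memory slot. */
static int next(int n) {
    bump(&n);
    return n;
}
```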
//===---------------------------------------------------------------------===//
We should try to avoid generating basic blocks which only contain
jumps. At -O0, this penalizes us all the way from IRgen (malloc &
instruction overhead) down through code generation and
assembly time.
On 176.gcc:expr.ll, it looks like over 12% of basic blocks are just
direct branches!
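One way such a measurement could be sketched, using a deliberately simplified and hypothetical block description (this is not Clang's or LLVM's data structure):

```c
#include <stddef.h>

/* Hypothetical summary of a basic block. */
struct toy_block {
    size_t num_insts;      /* instruction count, terminator included */
    int ends_in_uncond_br; /* nonzero if the terminator is an
                              unconditional branch */
};

/* A block is "jump-only" when its sole instruction is an
   unconditional branch. */
static int is_jump_only(const struct toy_block *b) {
    return b->num_insts == 1 && b->ends_in_uncond_br;
}

/* Count the jump-only blocks in an array of n blocks. */
static size_t count_jump_only(const struct toy_block *blocks, size_t n) {
    size_t count = 0;
    for (size_t i = 0; i < n; ++i)
        count += (size_t)is_jump_only(&blocks[i]);
    return count;
}
```

Dividing the count by the total number of blocks gives the kind of ratio quoted above.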
//===---------------------------------------------------------------------===//
There are some more places where we could avoid generating unreachable code. For
example:
void f0(int a) { abort(); if (a) printf("hi"); }
still generates a call to printf. This doesn't occur much in real
code, but it would still be nice to clean up.
//===---------------------------------------------------------------------===//
Deferred generation of statics incurs some additional
overhead. Currently it is even possible to construct test cases with
O(N^2) behavior! For at least simple cases where we can tell a global
is used, it is probably not worth deferring it. This doesn't solve the
O(N^2) cases, though...
PR3810
//===---------------------------------------------------------------------===//