llvm-project/clang/lib/CodeGen
Chris Lattner fe34c1d53e Kill off the 'coerce' ABI passing form. Now 'direct' and 'extend' always
have a "coerce to" type which often matches the default lowering of Clang
type to LLVM IR type, but the coerce case can be handled by making them
not be the same.

This simplifies things and fixes issues where X86-64 abi lowering would 
return coerce after making preferred types exactly match up.  This caused
us to compile:

typedef float v4f32 __attribute__((__vector_size__(16)));
v4f32 foo(v4f32 X) {
  return X+X;
}

into this code at -O0:

define <4 x float> @foo(<4 x float> %X.coerce) nounwind {
entry:
  %retval = alloca <4 x float>, align 16          ; <<4 x float>*> [#uses=2]
  %coerce = alloca <4 x float>, align 16          ; <<4 x float>*> [#uses=2]
  %X.addr = alloca <4 x float>, align 16          ; <<4 x float>*> [#uses=3]
  store <4 x float> %X.coerce, <4 x float>* %coerce
  %X = load <4 x float>* %coerce                  ; <<4 x float>> [#uses=1]
  store <4 x float> %X, <4 x float>* %X.addr
  %tmp = load <4 x float>* %X.addr                ; <<4 x float>> [#uses=1]
  %tmp1 = load <4 x float>* %X.addr               ; <<4 x float>> [#uses=1]
  %add = fadd <4 x float> %tmp, %tmp1             ; <<4 x float>> [#uses=1]
  store <4 x float> %add, <4 x float>* %retval
  %0 = load <4 x float>* %retval                  ; <<4 x float>> [#uses=1]
  ret <4 x float> %0
}

Now we get:

define <4 x float> @foo(<4 x float> %X) nounwind {
entry:
  %X.addr = alloca <4 x float>, align 16          ; <<4 x float>*> [#uses=3]
  store <4 x float> %X, <4 x float>* %X.addr
  %tmp = load <4 x float>* %X.addr                ; <<4 x float>> [#uses=1]
  %tmp1 = load <4 x float>* %X.addr               ; <<4 x float>> [#uses=1]
  %add = fadd <4 x float> %tmp, %tmp1             ; <<4 x float>> [#uses=1]
  ret <4 x float> %add
}

This implements rdar://8248065

llvm-svn: 109733
2010-07-29 06:26:06 +00:00
ABIInfo.h Kill off the 'coerce' ABI passing form. Now 'direct' and 'extend' always 2010-07-29 06:26:06 +00:00
BackendUtil.cpp Driver/IRgen: Add support for -momit-leaf-frame-pointer. 2010-07-01 01:31:45 +00:00
CGBlocks.cpp Fix flags in global block descriptor when 2010-07-28 19:07:18 +00:00
CGBlocks.h IRgen: Move blocks runtime interfaces to CodeGenModule. 2010-07-16 00:00:15 +00:00
CGBuilder.h Make CGBuilderTy a typedef again; its functionality has been rolled back 2010-07-06 18:43:48 +00:00
CGBuiltin.cpp 2nd argument of __builtin_expect must be evaluated 2010-07-26 23:11:03 +00:00
CGCXX.cpp Introduce Decl::hasBody() and FunctionDecl::hasBody() and use them instead of getBody() when we are just checking the existence of a body, to avoid de-serialization of the body from PCH. 2010-07-07 11:31:19 +00:00
CGCXX.h Remove tabs, and whitespace cleanups. 2009-09-09 15:08:12 +00:00
CGCXXABI.h Add a stub Microsoft Visual C++ ABI class (with stub mangler). 2010-06-09 23:25:41 +00:00
CGCall.cpp Kill off the 'coerce' ABI passing form. Now 'direct' and 'extend' always 2010-07-29 06:26:06 +00:00
CGCall.h relax the CGFunctionInfo::CGFunctionInfo ctor to allow any sequence 2010-06-29 18:13:52 +00:00
CGClass.cpp Rename LazyCleanup -> Cleanup. No functionality change for these last three 2010-07-21 07:22:38 +00:00
CGDebugInfo.cpp Override selected builtin names (e.g. "long int" instead of "long") to match names used by gcc in debug info. This makes gdb testsuite happy. 2010-07-28 23:23:29 +00:00
CGDebugInfo.h Always use current working directory for DW_AT_comp_dir. 2010-07-27 20:49:59 +00:00
CGDecl.cpp Turn off EH cleanups for __block variables; they caused some internal buildbot 2010-07-22 21:25:44 +00:00
CGDeclCXX.cpp Rename LazyCleanup -> Cleanup. No functionality change for these last three 2010-07-21 07:22:38 +00:00
CGException.cpp Support catching Objective C pointers in C++ under the non-fragile NeXT runtime. 2010-07-24 00:37:23 +00:00
CGException.h Revise cleanup IR generation to fix a major bug with cleanups (PR7686) 2010-07-23 21:56:41 +00:00
CGExpr.cpp Switch the destructor for a temporary arising from a reference binding over to 2010-07-21 06:29:51 +00:00
CGExprAgg.cpp Introduce a new cast kind for an "lvalue bitcast", which handles 2010-07-13 23:17:26 +00:00
CGExprCXX.cpp Implement zero-initialization for array new when there is an 2010-07-21 01:10:17 +00:00
CGExprComplex.cpp Add lvalue-bitcast support for complex numbers. 2010-07-14 21:35:45 +00:00
CGExprConstant.cpp Fix crash initializing a bit-field with a non-constant in a place where we 2010-07-17 23:55:01 +00:00
CGExprScalar.cpp Vectors are not integer types, so the type system should not classify 2010-07-23 15:58:24 +00:00
CGObjC.cpp Return type of a setter call caused by 2010-07-24 00:34:08 +00:00
CGObjCGNU.cpp Support catching Objective C pointers in C++ under the non-fragile NeXT runtime. 2010-07-24 00:37:23 +00:00
CGObjCMac.cpp Support catching Objective C pointers in C++ under the non-fragile NeXT runtime. 2010-07-24 00:37:23 +00:00
CGObjCRuntime.h Support catching Objective C pointers in C++ under the non-fragile NeXT runtime. 2010-07-24 00:37:23 +00:00
CGRTTI.cpp getBody() -> hasBody() 2010-07-07 12:24:18 +00:00
CGRecordLayout.h Keep track of the LLVM field numbers for non-virtual bases. 2010-05-18 05:22:06 +00:00
CGRecordLayoutBuilder.cpp AST: Rename PragmaPackAttr to MaxFieldAlignmentAttr, which is more accurate. 2010-05-27 01:12:46 +00:00
CGStmt.cpp Revise cleanup IR generation to fix a major bug with cleanups (PR7686) 2010-07-23 21:56:41 +00:00
CGTemporaries.cpp Rename LazyCleanup -> Cleanup. No functionality change for these last three 2010-07-21 07:22:38 +00:00
CGVTT.cpp The global variable for the VTT might not have external linkage; allow 2010-05-06 22:18:21 +00:00
CGVTables.cpp Remove the vast majority of the Destroy methods from the AST library, 2010-07-25 18:17:45 +00:00
CGVTables.h Introduce Decl::hasBody() and FunctionDecl::hasBody() and use them instead of getBody() when we are just checking the existence of a body, to avoid de-serialization of the body from PCH. 2010-07-07 11:31:19 +00:00
CGValue.h Adopt objc_assign_threadlocal() for __thread variables of GC types. 2010-07-20 20:30:03 +00:00
CMakeLists.txt Update CMake build for new attribute changes. 2010-06-17 00:37:02 +00:00
CodeGenAction.cpp Break Frontend's dependency on Rewrite, Checker and CodeGen in shared library configuration 2010-06-15 17:48:49 +00:00
CodeGenFunction.cpp When creating a jump destination, its scope should be the scope of the 2010-07-28 01:07:35 +00:00
CodeGenFunction.h When creating a jump destination, its scope should be the scope of the 2010-07-28 01:07:35 +00:00
CodeGenModule.cpp we are not supposed to create an improper callsite using a CallInstr; leave a fixme mentioning the simplification when CallSite can clone itself 2010-07-28 09:19:33 +00:00
CodeGenModule.h cave in to reality and make ABIInfo depend on CodeGenTypes. 2010-07-29 02:01:43 +00:00
CodeGenTypes.cpp fix rdar://8147692 - yet another crash due to my abi work. 2010-07-01 06:20:47 +00:00
CodeGenTypes.h Kill off the 'coerce' ABI passing form. Now 'direct' and 'extend' always 2010-07-29 06:26:06 +00:00
GlobalDecl.h Add GlobalDecl::getCanonicalDecl. 2010-06-22 16:00:14 +00:00
ItaniumCXXABI.cpp IRgen: Add a stub class for generating ABI-specific C++ code. 2010-05-25 19:52:27 +00:00
Makefile BUILD_ARCHIVE is the default for libraries, no need to set it. 2010-07-18 00:14:47 +00:00
Mangle.cpp Mangle enum constant expressions. Fixes rdar://problem/8204122 2010-07-24 01:17:35 +00:00
Mangle.h Add function for mangling reference temporaries. 2010-06-26 16:09:40 +00:00
MicrosoftCXXABI.cpp Mangle Objective-C pointers and block pointers in the Microsoft C++ Mangler. 2010-07-03 16:56:59 +00:00
ModuleBuilder.cpp Move CodeGenOptions.h *back* into Frontend. This should have been done when the 2010-06-15 23:19:56 +00:00
README.txt These IRgen improvements have been done. 2009-07-23 03:03:07 +00:00
TargetInfo.cpp Kill off the 'coerce' ABI passing form. Now 'direct' and 'extend' always 2010-07-29 06:26:06 +00:00
TargetInfo.h Fix the IR generation for catching pointers by references. 2010-07-20 22:17:55 +00:00

README.txt

IRgen optimization opportunities.

//===---------------------------------------------------------------------===//

The common pattern of
--
short x; // or char, etc
(x == 10)
--
generates a zext/sext of x, which can easily be avoided.
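
As a rough illustration of why the extension is avoidable, the sketch
below compares a 16-bit value against a constant two ways: after
sign-extending both to 32 bits, and on the raw 16-bit patterns. The
function names are hypothetical, and C itself still promotes the
operands, so this only models the two IR forms; it is not what IRgen
emits.

```c
#include <stdint.h>

/* Models the current form: sign-extend both sides to 32 bits, then compare. */
int eq_widened(int16_t x, int16_t c) {
    return (int32_t)x == (int32_t)c;
}

/* Models the cheaper form: compare the 16-bit bit patterns directly. */
int eq_narrow(int16_t x, int16_t c) {
    return (uint16_t)x == (uint16_t)c;
}

/* Count the 16-bit inputs on which the two forms disagree. */
int count_mismatches(int16_t c) {
    int bad = 0;
    for (int32_t v = INT16_MIN; v <= INT16_MAX; ++v) {
        int16_t x = (int16_t)v;
        if (eq_widened(x, c) != eq_narrow(x, c))
            ++bad;
    }
    return bad;
}
```

For an equality test the two forms agree on every input, which is what
makes dropping the extension safe when the constant fits in the narrow
type.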

//===---------------------------------------------------------------------===//

Bitfield accesses can be shifted to simplify masking and sign
extension. For example, if the bitfield width is 8 and it is
appropriately aligned then it is a lot shorter to just load the char
directly.
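
A minimal sketch of the two access paths, assuming a hypothetical
layout: a signed 8-bit field at bit offset 8 of a 32-bit storage unit
whose bytes are held in little-endian order. The function names are
made up for illustration.

```c
#include <stdint.h>
#include <string.h>

/* General path: load the whole unit, shift, mask, then sign-extend. */
int32_t field_via_shift_mask(const uint8_t unit[4]) {
    uint32_t w = (uint32_t)unit[0] | ((uint32_t)unit[1] << 8) |
                 ((uint32_t)unit[2] << 16) | ((uint32_t)unit[3] << 24);
    int32_t v = (int32_t)((w >> 8) & 0xFFu);
    return (v ^ 0x80) - 0x80;   /* sign-extend from bit 7 */
}

/* Shortcut: the field is exactly byte 1, so load that byte as signed. */
int32_t field_via_byte_load(const uint8_t unit[4]) {
    int8_t b;
    memcpy(&b, &unit[1], sizeof b);
    return b;
}
```

Both functions return the same value for any contents of the unit; the
second needs no shift, mask, or explicit sign-extension arithmetic.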

//===---------------------------------------------------------------------===//

It may be worth avoiding creation of allocas for formal arguments
for the common situation where the argument is never written to and
never has its address taken. The idea would be to begin generating code
by using the argument directly and, if its address is taken or it is
stored to, then generate the alloca and patch up the existing code.

In theory, the same optimization could be a win for block local
variables as long as the declaration dominates all statements in the
block.

NOTE: The main case we care about this for is for -O0 -g compile time
performance, and in that scenario we will need to emit the alloca
anyway currently to emit proper debug info. So this is blocked by
being able to emit debug information which refers to an LLVM
temporary, not an alloca.
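
The bookkeeping for the "use directly, patch up later" idea might look
roughly like the toy model below. ArgState, MAX_USES, and the arg_*
functions are all hypothetical names invented for this sketch; none of
them exist in clang, and real IRgen would patch instructions rather
than flip flags.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_USES 16

enum { USE_DIRECT = 0, USE_LOAD = 1 };

/* Hypothetical per-argument state kept while emitting a function body. */
typedef struct {
    bool   has_slot;            /* has a stack slot (alloca) been created? */
    int    use_kind[MAX_USES];  /* how each recorded read was emitted */
    size_t num_uses;            /* number of reads recorded so far */
    int    patched;             /* reads rewritten once the slot appeared */
} ArgState;

void arg_init(ArgState *a) {
    a->has_slot = false;
    a->num_uses = 0;
    a->patched = 0;
}

/* Create the slot on demand and rewrite earlier direct reads as loads. */
void arg_materialize_slot(ArgState *a) {
    if (a->has_slot)
        return;
    a->has_slot = true;
    for (size_t i = 0; i < a->num_uses; ++i) {
        a->use_kind[i] = USE_LOAD;
        ++a->patched;
    }
}

/* Emit a read: use the argument directly unless a slot already exists. */
int arg_read(ArgState *a) {
    if (!a->has_slot && a->num_uses == MAX_USES)
        arg_materialize_slot(a);  /* out of bookkeeping space: give up */
    if (a->has_slot)
        return USE_LOAD;
    a->use_kind[a->num_uses++] = USE_DIRECT;
    return USE_DIRECT;
}

/* A store to the argument or taking its address forces the slot. */
void arg_store(ArgState *a)        { arg_materialize_slot(a); }
void arg_take_address(ArgState *a) { arg_materialize_slot(a); }
```

An argument that is only ever read never pays for a slot in this model;
the cost of patching is incurred only by the first store or
address-of.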

//===---------------------------------------------------------------------===//

We should try to avoid generating basic blocks that contain only
jumps. At -O0, this penalizes us all the way from IRgen (malloc &
instruction overhead), all the way down through code generation and
assembly time.

On 176.gcc:expr.ll, it looks like over 12% of basic blocks are just
direct branches!
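
One way to picture the fix is to resolve branch targets through
jump-only blocks before emitting them. The sketch below uses a
hypothetical, much simplified Block record (not an LLVM type) and also
shows how a percentage like the one above could be measured.

```c
#include <stddef.h>

/* Hypothetical block record: either real code, or nothing but an
   unconditional branch to `target`. */
typedef struct {
    int    jump_only;  /* nonzero if the block contains only a direct branch */
    size_t target;     /* index of the branch destination when jump_only */
} Block;

/* Follow chains of jump-only blocks to the block that does real work.
   The step bound guards against cycles made entirely of empty blocks. */
size_t final_target(const Block *blocks, size_t n, size_t i) {
    for (size_t steps = 0; steps < n && blocks[i].jump_only; ++steps)
        i = blocks[i].target;
    return i;
}

/* Percentage of blocks that are just direct branches. */
double jump_only_percent(const Block *blocks, size_t n) {
    size_t count = 0;
    for (size_t i = 0; i < n; ++i)
        if (blocks[i].jump_only)
            ++count;
    return n ? 100.0 * (double)count / (double)n : 0.0;
}
```

Branching straight to final_target(...) leaves the intermediate blocks
without predecessors, so they need not be created at all.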

//===---------------------------------------------------------------------===//