forked from OSchip/llvm-project
208987844f
This patch resumes the work of D16586.

According to the AAPCS, volatile bit-fields should be accessed using containers of the width of their declared type. In such a case:

```
struct S1 {
  short a : 1;
};
```

should be accessed using loads and stores of the container width (`sizeof(short)`), whereas the compiler currently loads only the minimum required width (`char` in this case).

However, as discussed in D16586, that could overwrite non-volatile bit-fields, which conflicted with the C and C++ object models by creating data races on memory locations that are not part of the bit-field, e.g.

```
struct S2 {
  short a;
  int b : 16;
};
```

Accessing `S2.b` would also access `S2.a`.

The AAPCS Release 2020Q2 (https://documentation-service.arm.com/static/5efb7fbedbdee951c1ccf186?token=), section 8.1 Data Types, page 36, "Volatile bit-fields - preserving number and width of container accesses", has been updated to avoid conflict with the C++ Memory Model. The note now reads:

```
This ABI does not place any restrictions on the access widths of bit-fields
where the container overlaps with a non-bit-field member or where the container
overlaps with any zero length bit-field placed between two other bit-fields.
This is because the C/C++ memory model defines these as being separate memory
locations, which can be accessed by two threads simultaneously. For this reason,
compilers must be permitted to use a narrower memory access width (including
splitting the access into multiple instructions) to avoid writing to a different
memory location. For example, in struct S { int a:24; char b; }; a write to a
must not also write to the location occupied by b, this requires at least two
memory accesses in all current Arm architectures. In the same way, in
struct S { int a:24; int:0; int b:8; };, writes to a or b must not overwrite
each other.
```

I've updated the patch D16586 to follow this behavior by verifying that we only change a volatile bit-field access when:
- it won't overlap with any other non-bit-field member
- we only access memory inside the bounds of the record
- it avoids overlapping zero-length bit-fields.

The number of memory accesses should also be preserved; that will be implemented by D67399.

Reviewed By: ostannard

Differential Revision: https://reviews.llvm.org/D72932
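The three conditions in the commit message can be illustrated with a small standalone model. This is only a sketch and not the code from the patch: `ByteRange`, `RecordModel`, `canUseDeclaredWidth`, `modelS1` and `modelS2` are hypothetical names, the layouts are written out by hand assuming the AAPCS sizes (`short` is 2 bytes, `int` is 4 bytes), and a zero-length bit-field is simplified to a byte offset that the container must not straddle.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Half-open byte range [begin, end) inside a record (hypothetical model type).
struct ByteRange {
  uint64_t begin;
  uint64_t end;
};

// Simplified, hand-written description of a record layout (hypothetical).
struct RecordModel {
  uint64_t sizeInBytes;                      // sizeof the record
  std::vector<ByteRange> nonBitFieldMembers; // storage of ordinary members
  std::vector<uint64_t> zeroLengthBitFields; // byte offsets of `T : 0` barriers
};

static bool overlaps(ByteRange a, ByteRange b) {
  return a.begin < b.end && b.begin < a.end;
}

// Returns true if a volatile bit-field may be accessed through `container`,
// i.e. a load/store as wide as the bit-field's declared type.
bool canUseDeclaredWidth(const RecordModel &rec, ByteRange container) {
  // Condition: only access memory inside the bounds of the record.
  if (container.end > rec.sizeInBytes)
    return false;
  // Condition: do not overlap any non-bit-field member.
  for (ByteRange member : rec.nonBitFieldMembers)
    if (overlaps(container, member))
      return false;
  // Condition: do not straddle a zero-length bit-field barrier.
  for (uint64_t barrier : rec.zeroLengthBitFields)
    if (container.begin < barrier && barrier < container.end)
      return false;
  return true;
}

// struct S1 { short a : 1; };  assumed size: 2 bytes.
RecordModel modelS1() { return RecordModel{2, {}, {}}; }

// struct S2 { short a; int b : 16; };  `a` occupies bytes [0, 2),
// `b` occupies bytes [2, 4); assumed size: 4 bytes.
RecordModel modelS2() { return RecordModel{4, {ByteRange{0, 2}}, {}}; }
```

Under these assumptions, `S1.a` can use a 2-byte container, while the 4-byte `int` container for `S2.b` is rejected because it covers `S2.a`; a narrower 2-byte access to bytes [2, 4) remains acceptable.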
||
---|---|---|
ABIInfo.h | ||
Address.h | ||
BackendUtil.cpp | ||
CGAtomic.cpp | ||
CGBlocks.cpp | ||
CGBlocks.h | ||
CGBuilder.h | ||
CGBuiltin.cpp | ||
CGCUDANV.cpp | ||
CGCUDARuntime.cpp | ||
CGCUDARuntime.h | ||
CGCXX.cpp | ||
CGCXXABI.cpp | ||
CGCXXABI.h | ||
CGCall.cpp | ||
CGCall.h | ||
CGClass.cpp | ||
CGCleanup.cpp | ||
CGCleanup.h | ||
CGCoroutine.cpp | ||
CGDebugInfo.cpp | ||
CGDebugInfo.h | ||
CGDecl.cpp | ||
CGDeclCXX.cpp | ||
CGException.cpp | ||
CGExpr.cpp | ||
CGExprAgg.cpp | ||
CGExprCXX.cpp | ||
CGExprComplex.cpp | ||
CGExprConstant.cpp | ||
CGExprScalar.cpp | ||
CGGPUBuiltin.cpp | ||
CGLoopInfo.cpp | ||
CGLoopInfo.h | ||
CGNonTrivialStruct.cpp | ||
CGObjC.cpp | ||
CGObjCGNU.cpp | ||
CGObjCMac.cpp | ||
CGObjCRuntime.cpp | ||
CGObjCRuntime.h | ||
CGOpenCLRuntime.cpp | ||
CGOpenCLRuntime.h | ||
CGOpenMPRuntime.cpp | ||
CGOpenMPRuntime.h | ||
CGOpenMPRuntimeAMDGCN.cpp | ||
CGOpenMPRuntimeAMDGCN.h | ||
CGOpenMPRuntimeGPU.cpp | ||
CGOpenMPRuntimeGPU.h | ||
CGOpenMPRuntimeNVPTX.cpp | ||
CGOpenMPRuntimeNVPTX.h | ||
CGRecordLayout.h | ||
CGRecordLayoutBuilder.cpp | ||
CGStmt.cpp | ||
CGStmtOpenMP.cpp | ||
CGVTT.cpp | ||
CGVTables.cpp | ||
CGVTables.h | ||
CGValue.h | ||
CMakeLists.txt | ||
CodeGenABITypes.cpp | ||
CodeGenAction.cpp | ||
CodeGenFunction.cpp | ||
CodeGenFunction.h | ||
CodeGenModule.cpp | ||
CodeGenModule.h | ||
CodeGenPGO.cpp | ||
CodeGenPGO.h | ||
CodeGenTBAA.cpp | ||
CodeGenTBAA.h | ||
CodeGenTypeCache.h | ||
CodeGenTypes.cpp | ||
CodeGenTypes.h | ||
ConstantEmitter.h | ||
ConstantInitBuilder.cpp | ||
CoverageMappingGen.cpp | ||
CoverageMappingGen.h | ||
EHScopeStack.h | ||
ItaniumCXXABI.cpp | ||
MacroPPCallbacks.cpp | ||
MacroPPCallbacks.h | ||
MicrosoftCXXABI.cpp | ||
ModuleBuilder.cpp | ||
ObjectFilePCHContainerOperations.cpp | ||
PatternInit.cpp | ||
PatternInit.h | ||
README.txt | ||
SanitizerMetadata.cpp | ||
SanitizerMetadata.h | ||
SwiftCallingConv.cpp | ||
TargetInfo.cpp | ||
TargetInfo.h | ||
VarBypassDetector.cpp | ||
VarBypassDetector.h |
README.txt
IRgen optimization opportunities.

//===---------------------------------------------------------------------===//

The common pattern of
--
short x; // or char, etc
(x == 10)
--
generates a zext/sext of x which can easily be avoided.

//===---------------------------------------------------------------------===//

Bitfield accesses can be shifted to simplify masking and sign
extension. For example, if the bitfield width is 8 and it is
appropriately aligned then it is a lot shorter to just load the char
directly.

//===---------------------------------------------------------------------===//

It may be worth avoiding creation of allocas for formal arguments
for the common situation where the argument is never written to or has
its address taken. The idea would be to begin generating code by using
the argument directly and, if its address is taken or it is stored to,
then generate the alloca and patch up the existing code.

In theory, the same optimization could be a win for block local
variables as long as the declaration dominates all statements in the
block.

NOTE: The main case we care about this for is -O0 -g compile time
performance, and in that scenario we currently need to emit the alloca
anyway to emit proper debug info. So this is blocked on being able to
emit debug information which refers to an LLVM temporary, not an
alloca.

//===---------------------------------------------------------------------===//

We should try to avoid generating basic blocks which only contain
jumps. At -O0, this penalizes us all the way from IRgen (malloc &
instruction overhead) down through code generation and assembly
time.

On 176.gcc:expr.ll, it looks like over 12% of basic blocks are just
direct branches!

//===---------------------------------------------------------------------===//
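As an illustration of the bitfield note above, the following sketch compares the two ways of reading an 8-bit signed field that sits at bit offset 8 of a 32-bit storage unit. It is not taken from Clang: `kFieldShift`, `loadViaShiftAndMask`, `loadDirect` and `isLittleEndian` are hypothetical names, and the field position is an assumption chosen so that the field is byte-aligned.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Assumed position of the 8-bit field inside its 32-bit storage unit.
constexpr unsigned kFieldShift = 8;

// General path: load the whole unit, shift, mask, then sign-extend.
int32_t loadViaShiftAndMask(const unsigned char *storage) {
  uint32_t unit;
  std::memcpy(&unit, storage, sizeof unit);
  uint32_t raw = (unit >> kFieldShift) & 0xFFu;
  // Sign-extend the 8-bit value without relying on narrowing conversions.
  return static_cast<int32_t>(raw ^ 0x80u) - 0x80;
}

// Runtime byte-order probe so the sketch works on either endianness.
bool isLittleEndian() {
  const uint16_t probe = 1;
  unsigned char first;
  std::memcpy(&first, &probe, 1);
  return first == 1;
}

// Direct path: the field is byte-aligned, so a single char load suffices.
int32_t loadDirect(const unsigned char *storage) {
  const unsigned byteIndex =
      isLittleEndian() ? kFieldShift / 8 : 3 - kFieldShift / 8;
  signed char value;
  std::memcpy(&value, storage + byteIndex, 1);
  return value; // promotion to int32_t performs the sign extension
}
```

Both functions return the same value for any storage contents; the direct version needs no shift or mask, which is the saving the note describes.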