Summary:
Original change was D43991 (rL326541) and was reverted by rL326571 and
rL326572. This adds also the necessary MCCodeEmitter patch.
Reviewers: sbc100
Subscribers: jfb, dschuff, sbc100, jgravelle-google, sunfish, llvm-commits, ncw
Differential Revision: https://reviews.llvm.org/D44034
llvm-svn: 326614
The byte-swapping loads and stores do not actually perform multiple
accesses to their memory operand, so they are OK to use with volatile
memory operands as well. Remove overly cautious check.
llvm-svn: 326613
This adds back-end support for the anyregcc calling convention
for use with patchpoints.
Since all registers are considered call-saved with anyregcc
(except for 0 and 1 which may still be clobbered by PLT stubs
and the like), this required adding support for saving and
restoring vector registers in prologue/epilogue code for the
first time. This is not used by any other calling convention.
llvm-svn: 326612
On SystemZ we need to provide a register save area of 160 bytes to
any called function. This size needs to be added when allocating
stack in the function prologue. However, it was not accounted for
as part of MachineFrameInfo::getStackSize(); instead the back-end
used a private routine getAllocatedStackSize().
This is OK for code-gen purposes, but it breaks other users of
the getStackSize() routine, in particular it breaks the recently-
added -stack-size-section feature.
Fix this by updating the main stack size tracked by common code
(in emitPrologue) instead of using the private routine.
No change in code generation intended.
llvm-svn: 326610
This adds support for specifying vector registers for use with inline
asm statements, either via the 'v' constraint or by explicit register
names (v0 ... v31).
llvm-svn: 326609
The code was checking that all of the instructions in the
sequence are 'fast', but that's not necessary. The final
multiply is all that we need to check (tests adjusted).
The fmul doesn't need to be fully 'fast' either, but that
can be another patch.
llvm-svn: 326608
This narrow fold was added with no motivation or test cases
a bit over 5 years ago. Removing a constant operand is a
good canonicalization? We should handle Y*2.0 too then?
llvm-svn: 326606
The test case previously triggered an assertion in the AST matchers because the QualType being matched is invalid. That is no longer the case after r326604.
llvm-svn: 326605
There's not a particularly good way to test this with the AST matchers unit tests because the only way to get an invalid type (that I can devise) involves creating parse errors, which the test harness always treats as a failure. Instead, a clang-tidy test case will be added in a follow-up commit based on the original bug report.
llvm-svn: 326604
The patch fixes a number of bugs related to parameter indexing in
attributes:
* Parameter indices in some attributes (argument_with_type_tag,
pointer_with_type_tag, nonnull, ownership_takes, ownership_holds,
and ownership_returns) are specified in source as one-origin
including any C++ implicit this parameter, were stored as
zero-origin excluding any this parameter, and were erroneously
printing (-ast-print) and confusingly dumping (-ast-dump) as the
stored values.
* For alloc_size, the C++ implicit this parameter was not subtracted
correctly in Sema, leading to assert failures or to silent failures
of __builtin_object_size to compute a value.
* For argument_with_type_tag, pointer_with_type_tag, and
ownership_returns, the C++ implicit this parameter was not added
back to parameter indices in some diagnostics.
This patch fixes the above bugs and aims to prevent similar bugs in
the future by introducing careful mechanisms for handling parameter
indices in attributes. ParamIdx stores a parameter index and is
designed to hide the stored encoding while providing accessors that
require each use (such as printing) to make explicit the encoding that
is needed. Attribute declarations declare parameter index arguments
as [Variadic]ParamIdxArgument, which are exposed as ParamIdx[*]. This
patch rewrites all attribute arguments that are processed by
checkFunctionOrMethodParameterIndex in SemaDeclAttr.cpp to be declared
as [Variadic]ParamIdxArgument. The only exception is xray_log_args's
argument, which is encoded as a count not an index.
Differential Revision: https://reviews.llvm.org/D43248
llvm-svn: 326602
This is NFC for the moment (and independent of any potential NaN semantic
controversy). Besides making the code in InstSimplify easier to read, the
motivation is to eventually allow undef elements in vector constants to
match too. A proposal to add the base logic for that is in D43792.
llvm-svn: 326600
These instructions are double-pumped, split into 2 128-bit ops and then passing through either FPU pipe.
Found while testing llvm-mca (D43951)
llvm-svn: 326597
If we are only truncating bits from the extend we should be able to just use a smaller extend.
If we are truncating more than the extend we should be able to just use a fptrunc since the presense of the fpextend shouldn't affect rounding.
Differential Revision: https://reviews.llvm.org/D43970
llvm-svn: 326595
Patch fixes the problem with the functions marked as `declare simd`. If
the canonical declaration does not have associated `declare simd`
construct, we may not generate required code even if other
redeclarations are marked as `declare simd`.
llvm-svn: 326594
In CUDA mode all local variables are actually thread
local|threadprivate, not private, and, thus, they cannot be shared
between threads|lanes.
llvm-svn: 326590
Currently when AllowRemainder is disabled, pragma unroll count is not
respected even though there is no remainder. This bug causes a loop
fully unrolled in many cases even though the user specifies a unroll
count. Especially it affects OpenCL/CUDA since in many cases a loop
contains convergent instructions and currently AllowRemainder is
disabled for such loops.
Differential Revision: https://reviews.llvm.org/D43826
llvm-svn: 326585
When an Armv6m function dynamically re-aligns the stack, access to incoming
stack arguments (and to stack area, allocated for register varargs) is done via
SP, which is incorrect, as the SP is offset by an unknown amount relative to the
value of SP upon function entry.
This patch fixes it, by making access to "fixed" frame objects be done via FP
when the function needs stack re-alignment. It also changes the access to
"fixed" frame objects be done via FP (instead of using R6/BP) also for the case
when the stack frame contains variable sized objects. This should allow more
objects to fit within the immediate offset of the load instruction.
All of the above via a small refactoring to reuse the existing
`ARMFrameLowering::ResolveFrameIndexReference.`
Differential Revision: https://reviews.llvm.org/D43566
llvm-svn: 326584
This avoids the Writer unnecessarily having a member to retain ownership
of the function body.
Differential Revision: https://reviews.llvm.org/D43933
llvm-svn: 326580
Adrian Sampson's blog post provides a good and relatively up-do-date
introduction to LLVM. I think this post could be helpful for people wanting
to get started with LLVM.
Reviewers: asb, tonic, silvas, probinson, kristof.beyls, rengolin
Reviewed By: rengolin
Differential Revision: https://reviews.llvm.org/D42904
llvm-svn: 326576
This reverts commits r326541 and r326571.
The tests were correct, and were updated with incorrect expectations.
The original commit was broken and should be reverted to get things back
to a working state.
llvm-svn: 326572
r326541 slightly increased the size of WebAssembly object files
and it broke test/MC/WebAssembly/global-ctor-dtor.ll.
This commit updates the test to unbreak it, also mentioned this to the
author of the original commit in case they don't want it.
llvm-svn: 326571
Code generation of VLD3, VLD4, VST3 and VST4 with register writeback is
broken due to 2 separate bugs:
1) VLD1d64TPseudoWB_register and VLD1d64QPseudoWB_register are missing
rules to expand them to non pseudo MIR. These are selected for
ARMISD::VLD3_UPD/VLD4_UPD with v1i64 vectors in SelectVLD.
2) Selection of the right VLD/VST instruction is broken for load and
store of 3 and 4 v1i64 vectors. SelectVLD and SelectVST are called
with MIR opcode for fixed writeback (ie increment is access size)
and call getVLDSTRegisterUpdateOpcode() to select an opcode with
register writeback if base register update is of a different size.
Since getVLDSTRegisterUpdateOpcode() only knows about
VLD1/VLD2/VST1/VST2 the call is currently conditional on the number
of element in the vector.
However, VLD1/VST1 is selected by SelectVLD/SelectVST's caller for
load and stores of 3 or 4 v1i64 vectors. Therefore the opcode is not
updated which later lead to a fixed writeback instruction being
constructed with an extra operand for the register writeback.
This patch addresses the two issues as follows:
- it adds the necessary mapping from VLD1d64TPseudoWB_register and
VLD1d64QPseudoWB_register to VLD1d64Twb_register and
VLD1d64Qwb_register respectively. Like for the existing _fixed
variants, the cost of these is bumped for unaligned access.
- it changes the logic in SelectVLD and SelectVSD to call isVLDfixed
and isVSTfixed respectively to decide whether the opcode should be
updated. It also reworks the logic and comments for pushing the
writeback offset operand and r0 operand to clarify the logic:
writeback offset needs to be pushed if it's a register writeback,
r0 needs to be pushed if not and the instruction is a
VLD1/VLD2/VST1/VST2.
Reviewers: rengolin, t.p.northover, samparker
Reviewed By: samparker
Patch by Thomas Preud'homme <thomas.preudhomme@arm.com>
Differential Revision: https://reviews.llvm.org/D42970
llvm-svn: 326570
Summary:
Found by asan. Fiddling with code completion AST after
FrontendAction::Exceute can lead to errors.
Calling the callback in ProcessCodeCompleteResults to make sure we
don't access uninitialized state.
This particular issue comes from the fact that Sema::TUScope is
deleted when destructor of ~Parser runs, but still present in
Sema::TUScope and accessed when building completion items.
I'm still struggling to come up with a small repro. The relevant
stackframes reported by asan are:
ERROR: AddressSanitizer: heap-use-after-free on address
READ of size 8 at 0x61400020d090 thread T175
#0 0x5632dff7821b in llvm::SmallPtrSetImplBase::isSmall() const include/llvm/ADT/SmallPtrSet.h:195:33
#1 0x5632e0335901 in llvm::SmallPtrSetImplBase::insert_imp(void const*) include/llvm/ADT/SmallPtrSet.h:127:9
#2 0x5632e067347d in llvm::SmallPtrSetImpl<clang::Decl*>::insert(clang::Decl*) include/llvm/ADT/SmallPtrSet.h:372:14
#3 0x5632e065df80 in clang::Scope::AddDecl(clang::Decl*) tools/clang/include/clang/Sema/Scope.h:287:18
#4 0x5632e0623eea in clang::ASTReader::pushExternalDeclIntoScope(clang::NamedDecl*, clang::DeclarationName) clang/lib/Serialization/ASTReader.cpp
#5 0x5632e062ce74 in clang::ASTReader::finishPendingActions() tools/clang/lib/Serialization/ASTReader.cpp:9164:9
....
#30 0x5632e02009c4 in clang::index::generateUSRForDecl(clang::Decl const*, llvm::SmallVectorImpl<char>&) tools/clang/lib/Index/USRGeneration.cpp:1037:6
#31 0x5632dff73eab in clang::clangd::(anonymous namespace)::getSymbolID(clang::CodeCompletionResult const&) tools/clang/tools/extra/clangd/CodeComplete.cpp:326:20
#32 0x5632dff6fe91 in clang::clangd::CodeCompleteFlow::mergeResults(std::vector<clang::CodeCompletionResult, std::allocator<clang::CodeCompletionResult> > const&, clang::clangd::SymbolSlab const&)::'lambda'(clang::CodeCompletionResult const&)::operator()(clang::CodeCompletionResult const&) tools/clang/tools/extra/clangd/CodeComplete.cpp:938:24
#33 0x5632dff6e426 in clang::clangd::CodeCompleteFlow::mergeResults(std::vector<clang::CodeCompletionResult, std::allocator<clang::CodeCompletionResult> > const&, clang::clangd::SymbolSlab const&) third_party/llvm/llvm/tools/clang/tools/extra/clangd/CodeComplete.cpp:949:38
#34 0x5632dff7a34d in clang::clangd::CodeCompleteFlow::runWithSema() llvm/tools/clang/tools/extra/clangd/CodeComplete.cpp:894:16
#35 0x5632dff6df6a in clang::clangd::CodeCompleteFlow::run(clang::clangd::(anonymous namespace)::SemaCompleteInput const&) &&::'lambda'()::operator()() const third_party/llvm/llvm/tools/clang/tools/extra/clangd/CodeComplete.cpp:858:35
#36 0x5632dff6cd42 in clang::clangd::(anonymous namespace)::semaCodeComplete(std::unique_ptr<clang::CodeCompleteConsumer, std::default_delete<clang::CodeCompleteConsumer> >, clang::CodeCompleteOptions const&, clang::clangd::(anonymous namespace)::SemaCompleteInput const&, llvm::function_ref<void ()>) tools/clang/tools/extra/clangd/CodeComplete.cpp:735:5
0x61400020d090 is located 80 bytes inside of 432-byte region [0x61400020d040,0x61400020d1f0)
freed by thread T175 here:
#0 0x5632df74e115 in operator delete(void*, unsigned long) projects/compiler-rt/lib/asan/asan_new_delete.cc:161:3
#1 0x5632e0b06973 in clang::Parser::~Parser() tools/clang/lib/Parse/Parser.cpp:410:3
#2 0x5632e0b06ddd in clang::Parser::~Parser() clang/lib/Parse/Parser.cpp:408:19
#3 0x5632e0b03286 in std::unique_ptr<clang::Parser, std::default_delete<clang::Parser> >::~unique_ptr() .../bits/unique_ptr.h:236:4
#4 0x5632e0b021c4 in clang::ParseAST(clang::Sema&, bool, bool) tools/clang/lib/Parse/ParseAST.cpp:182:1
#5 0x5632e0726544 in clang::FrontendAction::Execute() tools/clang/lib/Frontend/FrontendAction.cpp:904:8
#6 0x5632dff6cd05 in clang::clangd::(anonymous namespace)::semaCodeComplete(std::unique_ptr<clang::CodeCompleteConsumer, std::default_delete<clang::CodeCompleteConsumer> >, clang::CodeCompleteOptions const&, clang::clangd::(anonymous namespace)::SemaCompleteInput const&, llvm::function_ref<void ()>) tools/clang/tools/extra/clangd/CodeComplete.cpp:728:15
Reviewers: sammccall
Reviewed By: sammccall
Subscribers: klimek, jkorous-apple, cfe-commits, ioeric
Differential Revision: https://reviews.llvm.org/D44000
llvm-svn: 326569
This patch adds support for detecting outer loops with irreducible control
flow in LV. Current detection uses SCCs and only works for innermost loops.
This patch adds a utility function that works on any CFG, given its RPO
traversal and its LoopInfoBase. This function is a generalization
of isIrreducibleCFG from lib/CodeGen/ShrinkWrap.cpp. The code in
lib/CodeGen/ShrinkWrap.cpp is also updated to use the new generic utility
function.
Patch by Diego Caballero <diego.caballero@intel.com>
Differential Revision: https://reviews.llvm.org/D40874
llvm-svn: 326568