Summary:
If you pass two 1024 bit vectors in IR with AVX2 on Windows 64. Both vectors will be split in four 256 bit pieces. The four pieces of the first argument will be passed indirectly using 4 gprs. The second argument will get passed via pointers in memory.
The PartOffsets stored for the second argument are all in terms of its original 1024 bit size. So the PartOffsets for each piece are 32 bytes apart. So if we consider it for copy elision we'll only load an 8 byte pointer, but we'll move the address 32 bytes. The stack object size we create for the first part is probably wrong too.
This issue was encountered by ISPC. I'm working on getting a reduce test case, but wanted to go ahead and get feedback on the fix.
Reviewers: rnk
Reviewed By: rnk
Subscribers: dbabokin, llvm-commits, hiraditya
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60801
llvm-svn: 358817
Summary:
Teach CorrelatedValuePropagation to also handle sub instructions in addition to add. Relatively simple since makeGuaranteedNoWrapRegion already understood sub instructions. Only subtle change is which range is passed as "Other" to that function, since sub isn't commutative.
Note that CorrelatedValuePropagation::processAddSub is still hidden behind a default-off flag as IndVarSimplify hasn't yet been fixed to strip the added nsw/nuw flags and causes a miscompile. (PR31181)
Reviewers: sanjoy, apilipenko, nikic
Reviewed By: nikic
Subscribers: hiraditya, jfb, jdoerfert, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60036
llvm-svn: 358816
This is very minor issue. The returned section index is only used by
DWARFDebugLine as an llvm::upper_bound input and the use case shouldn't
cause any behavioral change.
llvm-svn: 358814
Fix for https://bugs.llvm.org/show_bug.cgi?id=41477. On the x32 ABI
with stack probing a dynamic alloca will result in a WIN_ALLOCA_32
with a 32-bit size. The current implementation tries to copy it into
RAX, resulting in a physreg copy error. Fix this by copying to EAX
instead. Also fix incorrect opcodes or registers used in subs.
llvm-svn: 358807
The MOVZX doesn't require an immediate to be encoded at all. Though it does use
a 2 byte opcode so its the same size as a 1 byte immediate. But it has a
separate source and dest register so can help avoid copies.
llvm-svn: 358805
There's one slight regression in here because we don't check that the immediate
already allowed movzx before the shift. I'll fix that next.
llvm-svn: 358804
Exactly the same as G_FCEIL, G_FABS, etc.
Add tests for the fp16/nofp16 behaviour, update arm64-vfloatintrinsics, etc.
Differential Revision: https://reviews.llvm.org/D60895
llvm-svn: 358799
Summary:
The DataCount section is necessary for the bulk memory operations
memory.init and data.drop to validate, but it is not recognized by
engines that do not support bulk memory, so emit the section only if
bulk-memory is enabled.
Reviewers: aheejin, dschuff, sbc100
Subscribers: jgravelle-google, sunfish, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60637
llvm-svn: 358798
Moved UninitializedObjectChecker from the 'alpha.cplusplus' to the
'optin.cplusplus' package.
Differential Revision: https://reviews.llvm.org/D58573
llvm-svn: 358797
Exposed by a related bug about visibility of default arguments of nested
templates - without the correct decl context, default template
parameters of variable templates nested in classes would have incorrect
visibility computed.
llvm-svn: 358796
The code is/was already correct for the case where a parameter is a
parameter of its enclosing lexical DeclContext (functions and classes).
But for other templates (alias and variable templates) they don't create
their own scope to be members of - in those cases, they parameter should
be considered visible if any definition of the lexical decl context is
visible.
[this should cleanup the failure on the libstdc++ modules buildbot]
[this doesn't actually fix the variable template case for a
secondary/compounding reason (its lexical decl context is incorrectly
considered to be the translation unit)]
Test covers all 4 kinds of templates with default args, including a
regression test for the still broken variable template case.
Reviewers: rsmith
Differential Revision: https://reviews.llvm.org/D60892
llvm-svn: 358795
Summary:
This assumes all symbols are <4GB long, so we can store them as a 32-bit
integer. This reorders the fields so the length appears first, packing
with the other bitfield data in the base Symbol object.
This saved 70MB / 3.60% of heap allocations when linking
browser_tests.exe with no PDB. It's not much as a percentage, but worth
doing. I didn't do performance measurements, I don't think it will be
measurable in time.
Reviewers: ruiu, inglorion, amccarth, aganea
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60297
llvm-svn: 358794
My understanding is that once BuildMI has been called we can't fallback
to SelectionDAG.
This change moves the fallback for when getRegForValue() fails for
that target of an indirect call. This was failing in -fPIC mode when
the callee is GlobalValue.
Add a test case that tickles this.
Differential Revision: https://reviews.llvm.org/D60908
llvm-svn: 358793
As I was waiting for the test suite to complete at 99% I noticed this
test taking quite a bit of time. Since it's easy to split I just went
ahead and did so.
llvm-svn: 358792
This is a follow-up to r291037+r291258, which used null debug locations
to prevent jumpy line tables.
Using line 0 locations achieves the same effect, but works better for
crash attribution because it preserves the right inline scope.
Differential Revision: https://reviews.llvm.org/D60913
llvm-svn: 358791
VK_SABS is part of the SymLoc bitfield in the variant kind which should
be compared for equality, not by checking the VK_SABS bit.
As far as I know, the existing code happened to produce the correct
results in all cases, so this is just a cleanup.
Patch by Stephen Crane.
Differential Revision: https://reviews.llvm.org/D60596
llvm-svn: 358788
Summary:
This emits labels around heapallocsite calls and S_HEAPALLOCSITE debug
info in codeview. Currently only changes FastISel, so emitting labels still
needs to be implemented in SelectionDAG.
Reviewers: hans, rnk
Subscribers: aprantl, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D60800
llvm-svn: 358783
Calling `add_compiler_rt_component` sets up the component connection between runtime builds and the parent CMake configuration. Adding this call allows specifying `fuzzer` as a `LLVM_RUNTIME_DISTRIBUTION_COMPONENT`.
llvm-svn: 358780
This allows the cross compiled build targets to configure the LLVM tools and sub-projects that are part of the main build.
This is needed for generating native non llvm *-tablegen tools when cross compiling clang in the monorepo build environment.
llvm-svn: 358779
Previously, if the MSVC installation isn't detected properly, clang
will later just fail to execute link.exe.
This improves using clang in msvc mode on linux, where one intentionally
might not want to point clang to the MSVC installation itself (which
isn't executable as such), but where a link.exe named wine wrapper is
available in the path next to a cl.exe named wine wrapper.
Differential Revision: https://reviews.llvm.org/D60094
llvm-svn: 358778
This fixes the doxygen configuration to be functional again. I removed
the customer header and footer, as well as the no-longer-existent style
sheet. I also widened the scope of the documentation, from just the
public API to include the private interfaces as well.
llvm-svn: 358773
Summary:
Make the flags in LICM + MemorySSA tuning options in the old and new
pass managers.
Subscribers: mehdi_amini, jlebar, Prazek, george.burgess.iv, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60490
llvm-svn: 358772