The last iterator of MBB should be recognized as MBB.end() not as
MBB.instr_end() which could return bundled instruction that is not iterable
with basic iterator.
Patch by Nikola Prica.
Differential Revision: https://reviews.llvm.org/D41626
llvm-svn: 322015
This patch was part of:
https://reviews.llvm.org/D41338
...but we can expose the bug in IR via constant propagation
as shown in the test. Unless the triple includes 'linux', we
should not fold these because the functions don't exist on
other platforms (yet?).
llvm-svn: 322010
The new test fails on the Hexagon bot. Reverting while I investigate.
This reverts https://reviews.llvm.org/rL322005
This reverts commit b7e0026b4385180c378edc658ec91a39566f2942.
llvm-svn: 322008
This is an attempt of fixing PR35807.
Due to the non-standard definition of dominance in LLVM, where uses in
unreachable blocks are dominated by anything, you can have, in an
unreachable block:
%patatino = OP1 %patatino, CONSTANT
When `SimplifyInstruction` receives a PHI where an incoming value is of
the aforementioned form, in some cases, loops indefinitely.
What I propose here instead is keeping track of the incoming values
from unreachable blocks, and replacing them with undef. It fixes this
case, and it seems to be good regardless (even if we can't prove that
the value is constant, as it's coming from an unreachable block, we
can ignore it).
Differential Revision: https://reviews.llvm.org/D41812
llvm-svn: 322006
Adds option /guard:cf to clang-cl and -cfguard to cc1 to emit function IDs
of functions that have their address taken into a section named .gfids$y for
compatibility with Microsoft's Control Flow Guard feature.
Differential Revision: https://reviews.llvm.org/D40531
llvm-svn: 322005
This patch improves diagnostic for case when mapped instruction
does not contain a field listed under RowFields.
Differential Revision: https://reviews.llvm.org/D41778
llvm-svn: 322004
Summary:
Fix DanglingHandleCheck to handle the final implementation of std::string and std::string_view.
These use a conversion operator instead of a conversion constructor.
Reviewers: hokein
Subscribers: klimek, xazax.hun, cfe-commits
Differential Revision: https://reviews.llvm.org/D41779
llvm-svn: 322002
There is precedence for factorization transforms in instcombine for FP ops with fast-math.
We also have similar logic in foldSPFofSPF().
It would take more work to add this to reassociate because that's specialized for binops,
and min/max are not binops (or even single instructions). Also, I don't have evidence that
larger min/max trees than this exist in real code, but if we find that's true, we might
want to reorganize where/how we do this optimization.
In the motivating example from https://bugs.llvm.org/show_bug.cgi?id=35717 , we have:
int test(int xc, int xm, int xy) {
int xk;
if (xc < xm)
xk = xc < xy ? xc : xy;
else
xk = xm < xy ? xm : xy;
return xk;
}
This patch solves that problem because we recognize more min/max patterns after rL321672
https://rise4fun.com/Alive/Qjnehttps://rise4fun.com/Alive/3yg
Differential Revision: https://reviews.llvm.org/D41603
llvm-svn: 321998
Summary:
Fixes the bug with incorrect handling of InsertValue|InsertElement
instrucions in SLP vectorizer. Currently, we may use incorrect
ExtractElement instructions as the operands of the original
InsertValue|InsertElement instructions.
Reviewers: mkuper, hfinkel, RKSimon, spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D41767
llvm-svn: 321994
Summary:
If the vectorized value is marked as extra reduction argument, its users
are not considered as external users. Patch fixes this.
Reviewers: mkuper, hfinkel, RKSimon, spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D41786
llvm-svn: 321993
Adds the -fstack-size-section flag to enable the .stack_sizes section. The flag defaults to on for the PS4 triple.
Differential Revision: https://reviews.llvm.org/D40712
llvm-svn: 321992
This patch allows `r7` to be used, regardless of its use as a frame pointer, as
a temporary register when popping `lr`, and also falls back to using a high
temporary register if, for some reason, we weren't able to find a suitable low
one.
Differential revision: https://reviews.llvm.org/D40961
Fixes https://bugs.llvm.org/show_bug.cgi?id=35481
llvm-svn: 321989
(Target)FrameLowering::determineCalleeSaves can be called multiple
times. I don't think it should have side-effects as creating stack
objects and setting global MachineFunctionInfo state as it is doing
today (in other back-ends as well).
This moves the creation of stack objects from determineCalleeSaves to
assignCalleeSavedSpillSlots.
Differential Revision: https://reviews.llvm.org/D41703
llvm-svn: 321987
Previously, in r320472, I moved the calculation of section offsets and sizes
for compressed debug sections into maybeCompress, which happens before
assignAddresses, so that the compression had the required information. However,
I failed to take account of relocations that patch such sections. This had two
effects:
1. A race condition existed when a debug section referred to a different debug
section (see PR35788).
2. References to symbols in non-debug sections would be patched incorrectly.
This is because the addresses of such symbols are not calculated until after
assignAddresses (this was a partial regression caused by r320472, but they
could still have been broken before, in the event that a custom layout was used
in a linker script).
assignAddresses does not need to know about the output section size of
non-allocatable sections, because they do not affect the value of Dot. This
means that there is no longer a reason not to support custom layout of
compressed debug sections, as far as I'm aware. These two points allow for
delaying when maybeCompress can be called, removing the need for the loop I
previously added to calculate the section size, and therefore the race
condition. Furthermore, by delaying, we fix the issues of relocations getting
incorrect symbol values, because they have now all been finalized.
llvm-svn: 321986
Previously, the COFF driver would call exit(1) from the
ErrorHandler in the case of a link error, even if
CanExitEarly=false was specified. Now it initializes
the ErrorHandler in the same way that the ELF driver does.
Patch by Andrew Kelley.
Differential Revision: https://reviews.llvm.org/D41803
llvm-svn: 321983
LLD previously used to handle dynamic lists and version scripts in the
exact same way, even though they have very different semantics for
shared libraries and subtly different semantics for executables. r315114
untangled their semantics for executables (building on previous work to
correct their semantics for shared libraries). With that change, dynamic
lists won't set the default version to VER_NDX_LOCAL, and so resetting
the version to VER_NDX_GLOBAL in scanShlibUndefined is unnecessary.
This was causing an issue because version scripts containing `local: *`
work by setting the default version to VER_NDX_LOCAL, but scanShlibUndefined
would override this default, and therefore symbols which should have
been local would end up in the dynamic symbol table, which differs from
both bfd and gold's behavior. gold silently keeps the symbol hidden in
such a scenario, whereas bfd issues an error. I prefer bfd's behavior
and plan to implement that in LLD in a follow-up (and the test case
added here will be updated accordingly).
Differential Revision: https://reviews.llvm.org/D41639
llvm-svn: 321982
These tests assumes availability of external symbols provided by the
C++ library, but those won't be available in case when the C++ library
is statically linked because lli itself doesn't need these.
This uses llvm-readobj -needed-libs to check if C++ library is linked as
shared library and exposes that information as a feature to lit.
Differential Revision: https://reviews.llvm.org/D41272
llvm-svn: 321981
Check whether we are comparing the same entity, not merely the same
declaration, and don't assume that weak declarations resolve to distinct
entities.
llvm-svn: 321976
The approach was never discussed, I wasn't able to reproduce this
non-determinism, and the original author went AWOL.
After a discussion on the ML, Philip suggested to revert this.
llvm-svn: 321974
I had removed the qualifiers around the autogenerated folding table so I could compare with the manual table, but didn't intend to commit the change.
llvm-svn: 321971
Allow SimplifyDemandedBits to use TargetLoweringOpt::computeKnownBits to look through bitcasts. This can help simplifying in some cases where bitcasts of constants generated during or after legalization can't be folded away, and thus didn't get picked up by SimplifyDemandedBits. This fixes PR34620, where a redundant pand created during legalization from lowering and lshr <16xi8> wasn't being simplified due to the presence of a bitcasted build_vector as an operand.
Committed on the behalf of @sameconrad (Sam Conrad)
Differential Revision: https://reviews.llvm.org/D41643
llvm-svn: 321969
Summary:
There are few oddities that occur due to v1i1, v8i1, v16i1 being legal without v2i1 and v4i1 being legal when we don't have VLX. Particularly during legalization of v2i32/v4i32/v2i64/v4i64 masked gather/scatter/load/store. We end up promoting the mask argument to these during type legalization and then have to widen the promoted type to v8iX/v16iX and truncate it to get the element size back down to v8i1/v16i1 to use a 512-bit operation. Since need to fill the upper bits of the mask we have to fill with 0s at the promoted type.
It would be better if we could just have the v2i1/v4i1 types as legal so they don't undergo any promotion. Then we can just widen with 0s directly in a k register. There are no real v4i1/v2i1 instructions anyway. Everything is done on a larger register anyway.
This also fixes an issue that we couldn't implement a masked vextractf32x4 from zmm to xmm properly.
We now have to support widening more compares to 512-bit to get a mask result out so new tablegen patterns got added.
I had to hack the legalizer for widening the operand of a setcc a bit so it didn't try create a setcc returning v4i32, extract from it, then try to promote it using a sign extend to v2i1. Now we create the setcc with v4i1 if the original setcc's result type is v2i1. Then extract that and don't sign extend it at all.
There's definitely room for improvement with some follow up patches.
Reviewers: RKSimon, zvi, guyblank
Reviewed By: RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D41560
llvm-svn: 321967
This field is defined as kmp_int32, so we should use neither
pointers to kmp_int64 nor 64 bit atomic instructions.
(Found while testing on a Raspberry Pi, 32 bit ARM)
Differential Revision: https://reviews.llvm.org/D41656
llvm-svn: 321964