This patch adds the missing information to the LF_BUILDINFO record, which allows for rebuilding a .CPP without any external dependency but the .OBJ itself (other than the compiler).
Some external tools that we are using (Recode, Live++) are extracting the information to reproduce a build without any knowledge of the build system. The LF_BUILDINFO stores a full path to the compiler, the PWD (CWD at program startup), a relative or absolute path to the TU, and the full CC1 command line. The command line needs to be freestanding (not depend on any environment variables). In the same way, MSVC doesn't store the provided command-line, but an expanded version (somehow their equivalent of CC1) which is also freestanding.
For more information see PR36198 and D43002.
Differential Revision: https://reviews.llvm.org/D80833
X86 is the only user of this interface in tree. Previously the
X86 pass would loop over operands looking for one undef operand for
the pass to fix. But there could theoretically be multiple operands
to fix. So it makes more sense for the pass to do the looping and
ask the target if an operand needs to be fixed.
This allows us to subsequently configure the logger for the case when we
use a model evaluator and want to log additional outputs.
Differential Revision: https://reviews.llvm.org/D85577
A GlobalAlias is an address-taken user of its aliased function.
canRenameComdatFunc has excluded such cases.
Reviewed By: davidxl
Differential Revision: https://reviews.llvm.org/D85597
If a shuffle is referring to both the lower and upper half lanes of an unary horizontal op, then canonicalize the mask to only refer to the lower half.
Summary:
AIX assembler does not generate correct relocation when .rename
appear between tc entry label and .tc directive.
So only emit .rename after .tc/.comm or other linkage is emitted.
Reviewed By: daltenty, hubert.reinterpretcast
Differential Revision: https://reviews.llvm.org/D85317
On the frontend side, this patch recovers AIX static init implementation to
use the linkage type and function names Clang chooses for sinit related function.
On the backend side, this patch sets correct linkage and function names on aliases
created for sinit/sterm functions.
Differential Revision: https://reviews.llvm.org/D84534
Add a hidden option to the compiler to control a the PC Relative GOT indirect
linker optimization.
If this option is set to false the compiler will no loger produce the
relocations required by the linker to perform the optimization.
Reviewed By: nemanjai, NeHuang, #powerpc
Differential Revision: https://reviews.llvm.org/D85377
IT blocks with more than one instruction were performance deprecated in Armv8
but that doesn't mean we should follow that advise when optimising for size.
Differential Revision: https://reviews.llvm.org/D85638
Although the DWARF specification states that .debug_aranges entries
can't have length zero, these can occur in the wild. There's no
particular reason to enforce this part of the spec, since functionally
they have no impact. The patch removes the error and introduces a new
warning for premature terminator entries which does not stop parsing.
This is a relanding of cb3a598c87, adding the missing obj2yaml part
that was needed.
Fixes https://bugs.llvm.org/show_bug.cgi?id=46805. See also
https://reviews.llvm.org/D71932 which originally introduced the error.
Reviewed by: ikudrin, dblaikie, Higuoxing
Differential Revision: https://reviews.llvm.org/D85313
Check that we're shuffling hadd/pack ops first before altering shuffle masks.
First step towards adding extra functionality, plus it avoids costly shuffle mask manipulation if not necessary.
Although the DWARF specification states that .debug_aranges entries
can't have length zero, these can occur in the wild. There's no
particular reason to enforce this part of the spec, since functionally
they have no impact. The patch removes the error and introduces a new
warning for premature terminator entries which does not stop parsing.
Fixes https://bugs.llvm.org/show_bug.cgi?id=46805. See also
https://reviews.llvm.org/D71932 which originally introduced the error.
Reviewed by: ikudrin, dblaikie
Differential Revision: https://reviews.llvm.org/D85313
Instructions defined in the original inner loop preheader may depend on
values defined in the outer loop header, but the inner loop header will
become the entry block in the loop nest. Move the instructions from the
preheader to the outer loop header, so we do not break dominance. We
also have to check for unsafe instructions in the preheader. If there
are no unsafe instructions, all instructions should be movable.
Currently we move all instructions except the terminator and rely on
LICM to hoist out invariant instructions later.
Fixes PR45743
Values defined in the outer loop header could be used in the inner loop
latch. In that case, we need to create LCSSA phis for them, because after
interchanging they will be defined in the new inner loop and used in the
new outer loop.
This patch introduces two intrinsics: llvm.ppc.setflm and
llvm.ppc.readflm. They read from or write to FPSCR register
(floating-point status & control) which contains rounding mode and
exception status.
To ensure correctness of program, we need to prevent FP operations from
being moved across these intrinsics (mffs/mtfsf instruction), so here I
set them as scheduling boundaries. We can relax such restriction if
FPSCR is modeled well in the future.
Reviewed By: steven.zhang
Differential Revision: https://reviews.llvm.org/D84914
As noticed on D66004, scalarization of an expandload with a constant mask as a chain of irregular loads+inserts makes it tricky to optimize before lowering, resulting in difficulties in merging loads etc.
This patch instead scalarizes the expansion to a build_vector(load0, load1, undef, load2,....) style pattern and then performs a blend shuffle with the pass through vector. This allows us to more easily make use of all the build_vector combines, merging of consecutive loads etc.
Differential Revision: https://reviews.llvm.org/D85416
This also fixes the condition in the assertion in
DwarfCompileUnit::getLabelBegin() because it checked something unrelated
to the returned value.
Differential Revision: https://reviews.llvm.org/D85437
This patch adds noundef to return value and arguments of standard I/O functions.
With this patch, passing undef or poison to the functions becomes undefined
behavior in LLVM IR. Since undef/poison is lowered from operations having UB in C/C++,
passing undef to them was already UB in source.
With this patch, the functions cannot return undef or poison anymore as well.
According to C17 standard, ungetc/ungetwc/fgetpos/ftell can generate unspecified
value; 3.19.3 says unspecified value is a valid value of the relevant type,
and using unspecified value is unspecified behavior, which is not UB, so it
cannot be undef (using undef is UB when e.g. it is used at branch condition).
— The value of the file position indicator after a successful call to the ungetc function for a text stream, or the ungetwc function for any stream, until all pushed-back characters are read or discarded (7.21.7.10, 7.29.3.10).
— The details of the value stored by the fgetpos function (7.21.9.1).
— The details of the value returned by the ftell function for a text stream (7.21.9.4).
In the long run, most of the functions listed in BuildLibCalls should have noundefs; to remove redundant diffs which will anyway disappear in the future, I added noundef to a few more non-I/O functions as well.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D85345
Fix 64-bit copy to SCC by restricting the pattern resulting
in such a copy to subtargets supporting 64-bit scalar compare,
and mapping the copy to S_CMP_LG_U64.
Before introducing the S_CSELECT pattern with explicit SCC
(0045786f14), there was no need
for handling 64-bit copy to SCC ($scc = COPY sreg_64).
The proposed handling to read only the low bits was however
based on a false premise that it is only one bit that matters,
while in fact the copy source might be a vector of booleans and
all bits need to be considered.
The practical problem of mapping the 64-bit copy to SCC is that
the natural instruction to use (S_CMP_LG_U64) is not available
on old hardware. Fix it by restricting the problematic pattern
to subtargets supporting the instruction (hasScalarCompareEq64).
Differential Revision: https://reviews.llvm.org/D85207
Making use of undef is not safe if the simplification result is not used
to replace all uses of the result. This leads to problems in NewGVN,
which does not replace all uses in the IR directly. See PR33165 for more
details.
This patch adds an option to SimplifyQuery to disable the use of undef.
Note that I've only guarded uses if isa<UndefValue>/m_Undef where
SimplifyQuery is currently available. If we agree on the general
direction, I'll update the remaining uses.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D84792
Add support for (if enabled) splitting cold functions into a separate section
in order to further boost locality of hot code.
Authored by: rjf (Ruijie Fang)
Reviewed by: hiraditya,rcorcs,vsk
Differential Revision: https://reviews.llvm.org/D85331
This patch was adjusted to match the most basic pattern that starts with an insertelement
(so there's no extract created here). Hopefully, that removes any concern about
interfering with other passes. Ie, the transform should almost always be profitable.
We could make an argument that this could be part of canonicalization, but we
conservatively try not to create vector ops from scalar ops in passes like instcombine.
If the transform is not profitable, the backend should be able to re-scalarize the load.
Differential Revision: https://reviews.llvm.org/D81766
Currently the SCEVExpander tries to re-use existing casts, even if they
are not exactly at the insertion point it was asked to create the cast.
To do so in some case, it creates a new cast at the insertion point and
updates all users to use the new cast.
This behavior is problematic, because it changes the IR outside of the
instructions created during the expansion. Therefore we cannot
completely undo all changes made during expansion.
This re-use should be only an extra optimization, so only using the new
cast in the expanded instructions should not be a correctness issue.
There are many cases equivalent instructions are created during
expansion.
This patch also adjusts findInsertPointAfter to skip instructions
inserted during expansion. This enables re-using existing casts without
the renaming any uses, by picking a better insertion point.
Reviewed By: efriedma, lebedev.ri
Differential Revision: https://reviews.llvm.org/D84399
This adds patterns for v16i16's vecreduce, using all the existing code
to go via an i32 VADDV/VMLAV and truncating the result.
Differential Revision: https://reviews.llvm.org/D85452
This formats some of the MVE patterns, and adds a missing
Predicates = [HasMVEInt] to some VRHADD patterns I noticed
as going through. Although I don't believe NEON would ever
use the patterns (as it would use ADDL and VSHRN instead)
they should ideally be predicated on having MVE instructions.
These aren't the canonical forms we'd get from InstCombine, but
we do have X86 tests for them. Recognizing them is pretty cheap.
While there make use of APInt:isSignedMinValue/isSignedMaxValue
instead of creating a new APInt to compare with. Also use
SelectionDAG::getAllOnesConstant helper to hide the all ones
APInt creation.
Rather than handling zlib handling manually, use find_package from CMake
to find zlib properly. Use this to normalize the LLVM_ENABLE_ZLIB,
HAVE_ZLIB, HAVE_ZLIB_H. Furthermore, require zlib if LLVM_ENABLE_ZLIB is
set to YES, which requires the distributor to explicitly select whether
zlib is enabled or not. This simplifies the CMake handling and usage in
the rest of the tooling.
This is a reland of abb0075 with all followup changes and fixes that
should address issues that were reported in PR44780.
Differential Revision: https://reviews.llvm.org/D79219
Fixes PR47040, in which an assertion was improperly triggered during
FastISel's address computation. The issue was that an `Address` set to
be relative to the FrameIndex with offset zero was incorrectly
considered to have an unset base. When the left hand side of an add
set the Address to be 0 off the FrameIndex, the right side would not
detect that the Address base had already been set and could try to set
the Address to be relative to a register instead, triggering an
assertion.
This patch fixes the issue by explicitly tracking whether an `Address`
has been set rather than interpreting an offset of zero to mean the
`Address` has not been set.
Differential Revision: https://reviews.llvm.org/D85581