As proposed in D97109, I tried to make target creation consistent in `clang` and `clangd` by replacing the original procedure with a single function introduced in D97493.
This also helps `clangd` works with CUDA, OpenMP, etc.
Reviewed By: kadircet
Differential Revision: https://reviews.llvm.org/D98128
Programmers would like to be able to test direct methods by calling them from a
different linkage unit or mocking them, both of which are impossible. This
patch adds a flag that effectively disables the attribute, which will fix this
when enabled in testable builds. rdar://71190891
Differential revision: https://reviews.llvm.org/D95845
The key change (4f5e92c) to switch gc.result and gc.relocate to being readnone landed nearly two weeks ago, and we haven't seen any fallout. Time to remove the code added to make reverting easy.
This fixes an oversight in D99747 which moved the IMG init code from
SIAddIMGInit to AdjustInstrPostInstrSelection, but did not set the
hasPostISelHook flag on gather4 instructions.
Differential Revision: https://reviews.llvm.org/D99953
Previously we could only vectorize FP reductions if fast math was enabled, as this allows us to
reorder FP operations. However, it may still be beneficial to vectorize the loop by moving
the reduction inside the vectorized loop and making sure that the scalar reduction value
be an input to the horizontal reduction, e.g:
%phi = phi float [ 0.0, %entry ], [ %reduction, %vector_body ]
%load = load <8 x float>
%reduction = call float @llvm.vector.reduce.fadd.v8f32(float %phi, <8 x float> %load)
This patch adds a new flag (IsOrdered) to RecurrenceDescriptor and makes use of the changes added
by D75069 as much as possible, which already teaches the vectorizer about in-loop reductions.
For now in-order reduction support is off by default and controlled with the `-enable-strict-reductions` flag.
Reviewed By: david-arm
Differential Revision: https://reviews.llvm.org/D98435
The existing implementation was always creating 32-bit constants for
floating-point and integer reductions regardless of the actual type, which
resulted in invalid IR being generated for any types other than f32 and i32
when lowering affine.parallel to SCF. Use the actual type instead.
Reviewed By: chelini
Differential Revision: https://reviews.llvm.org/D99942
Problem:
On SystemZ we need to open text files in text mode. On Windows, files opened in text mode adds a CRLF '\r\n' which may not be desirable.
Solution:
This patch adds two new flags
- OF_CRLF which indicates that CRLF translation is used.
- OF_TextWithCRLF = OF_Text | OF_CRLF indicates that the file is text and uses CRLF translation.
Developers should now use either the OF_Text or OF_TextWithCRLF for text files and OF_None for binary files. If the developer doesn't want carriage returns on Windows, they should use OF_Text, if they do want carriage returns on Windows, they should use OF_TextWithCRLF.
So this is the behaviour per platform with my patch:
z/OS:
OF_None: open in binary mode
OF_Text : open in text mode
OF_TextWithCRLF: open in text mode
Windows:
OF_None: open file with no carriage return
OF_Text: open file with no carriage return
OF_TextWithCRLF: open file with carriage return
The Major change is in llvm/lib/Support/Windows/Path.inc to only set text mode if the OF_CRLF is set.
```
if (Flags & OF_CRLF)
CrtOpenFlags |= _O_TEXT;
```
These following files are the ones that still use OF_Text which I left unchanged. I modified all these except raw_ostream.cpp in recent patches so I know these were previously in Binary mode on Windows.
./llvm/lib/Support/raw_ostream.cpp
./llvm/lib/TableGen/Main.cpp
./llvm/tools/dsymutil/DwarfLinkerForBinary.cpp
./llvm/unittests/Support/Path.cpp
./clang/lib/StaticAnalyzer/Core/HTMLDiagnostics.cpp
./clang/lib/Frontend/CompilerInstance.cpp
./clang/lib/Driver/Driver.cpp
./clang/lib/Driver/ToolChains/Clang.cpp
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D99426
Changes getRecurrenceIdentity to always return a neutral value of -0.0 for FAdd.
Reviewed By: dmgreen, spatel
Differential Revision: https://reviews.llvm.org/D98963
For VPWidenPHIRecipes that model all incoming values as VPValue
operands, print those operands instead of printing the original PHI.
D99294 updates recipes of reduction PHIs to use the VPValue for the
incoming value from the loop backedge, making use of this new printing.
Remove the find_package(Python3 ...) call from Tooling/CMakeLists.txt as
it would override the python 3 version determined in llvm/CMakeLists.txt.
This call did not respect the LLVM_MINIMUM_PYTHON_VERSION.
This fixes the check-all target when building LLVM on a system where the
default python version is not the minimum required version for running tests.
Reviewed By: serge-sans-paille
Differential Revision: https://reviews.llvm.org/D99715
After rG47321c311bdbe0145b9bf45d822185c37b19fa50 we promote vXi8 reductions to vXi16 to create a much faster PMULLW mul reduction, followed by a (free) truncation. This avoids the high cost of repeated vXi8 multiplications (which extend+multiply+truncate to/from vXi16 types....).
Fixes the missing vXi8 mul reduction vectorization in PR42674 (Comment #20) 'mul16' test case.
1. Rename RVVBinBuiltin to RVVOutputOp1Builtin because it is not related
to the number of operand.
2. Add RVV Integer instuctions which use RVVOutputOp1Builtin.
Reviewed By: craig.topper
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Zakk Chen <zakk.chen@sifive.com>
Differential Revision: https://reviews.llvm.org/D99524
Use a with block for reading the cpuinfo file.
When loading the file fails (or we're not on Linux)
return an empty string. Since all the callers are
going to do "x in self.getCPUInfo()".
Reviewed By: omjavaid
Differential Revision: https://reviews.llvm.org/D99729
LLVM test CodeGen/AArch64/aarch64-tbz.ll tries to check for the absence
of a sequence of instructions with several CHECK-NOT with one of those
directives using a variable defined in another. However CHECK-NOT are
checked independently so that is using a variable defined in a pattern
that should not occur in the input.
This commit removes the definition and uses of variable to check each
line independently, making the check stronger than the current one. It
also removes unnecessary regex match for labels.
Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D99602
This patch enhances hasAddressTaken() to ignore bitcasts as a
callee in callbase instruction. Such bitcast usage doesn't really take
the address in a useful meaningful way.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D98884
It is possible that an entry in 'DestroyRetVal' lives longer
than an entry in 'LockMap' if not removed at checkDeadSymbols.
The added test case demonstrates this.
Reviewed By: NoQ
Differential Revision: https://reviews.llvm.org/D98504
It is generally beneficial to prefer "movi d0, #0" over "fmov s0, wzr" as this
is most efficient across all cores; it is recognised as a zeroing idiom. For
newer cores, fmov instructions can also be eliminated early and there is no
difference with movi, but some implementations lack this so is not true for
other/older cores. Thus this standardises on using movi as this should always
gives the same or better performance than the fmov with wzr.
Differential Revision: https://reviews.llvm.org/D99586