This transformation anchors on a padding op whose result is only used as an input
to a Linalg op and pulls it out of a given number of loops.
The result is a packing of padded tailes of ops that is amortized just before
the outermost loop from which the pad operation is hoisted.
Differential revision: https://reviews.llvm.org/D95243
Update the preprocessor regression tests to use the new driver if the new driver is built (FLANG_BUILD_NEW_DRIVER=On), otherwise the tests will still run using f18.
Summary of changes:
- Introduce %flang to the regression tests, which points to the new driver if it is built or otherwise points to f18
- Update all tests in flang/test/Preprocessing/ to use %flang
Differential Revision: https://reviews.llvm.org/D94805
The included test case triggered a sign assertion on the result in
`Success()`. This was caused by the APSInt created for a bitcast
having its signedness bit inverted. The second APSInt constructor
argument is `isUnsigned`, so invert the result of
`isSignedIntegerType`.
Differential Revision: https://reviews.llvm.org/D95135
We allow insert_subvector lowering of all legal types, so don't always cast to the vXi64/vXf64 shuffle types - this is only necessary for X86ISD::SHUF128/X86ISD::VPERM2X128 patterns later.
For a function that returns InstructionCost, it is very tempting to write:
return InstructionCost::Invalid;
But that actually returns InstructionCost(1 /* int value of Invalid */))
which has a totally different meaning. By marking this constructor as
`delete`, this can no longer happen.
Added documentation for the fast builtin
function declarations with -fdeclare-opencl-builtins.
Tags: #clang
Differential Revision: https://reviews.llvm.org/D95038
This patch adds support for scalable-vector splats in DAGCombiner's
`isConstantOrConstantVector` and `ISD::matchUnaryPredicate` functions,
which enable the SelectionDAG div/rem-by-constant optimizations for
scalable vector types.
It also fixes up one case where the UDIV optimization was generating a
SETCC without first consulting the target for its preferred SETCC result
type.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D94501
The llvm-dwp tool hard-codes the target triple to x86. Instead, deduce the
target triple from the object files being read.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D93749
D95140 introduced `static constexpr StringRef TypeStr = "SectionHeaderTable";` member
of `SectionHeaderTable` with in-class initialized. BB reports the link error:
/usr/bin/ld: lib/libLLVMObjectYAML.a(ELFYAML.cpp.o): in function
`llvm::yaml::MappingTraits<std::unique_ptr<llvm::ELFYAML::Chunk, std::default_delete<llvm::ELFYAML::Chunk> > >::mapping(llvm::yaml::IO&, std::unique_ptr<llvm::ELFYAML::Chunk, std::default_delete<llvm::ELFYAML::Chunk> >&)':
ELFYAML.cpp:(.text._ZN4llvm4yaml13MappingTraitsISt10unique_ptrINS_7ELFYAML5ChunkESt14default_deleteIS4_EEE7mappingERNS0_2IOERS7_+0x58): undefined reference to `llvm::ELFYAML::SectionHeaderTable::TypeStr'
/usr/bin/ld: ELFYAML.cpp:(.text._ZN4llvm4yaml13MappingTraitsISt10unique_ptrINS_7ELFYAML5ChunkESt14default_deleteIS4_EEE7mappingERNS0_2IOERS7_+0x353):undefined reference to `llvm::ELFYAML::SectionHeaderTable::TypeStr'
/usr/bin/ld: ELFYAML.cpp:(.text._ZN4llvm4yaml13MappingTraitsISt10unique_ptrINS_7ELFYAML5ChunkESt14default_deleteIS4_EEE7mappingERNS0_2IOERS7_+0x6e5): undefined reference to `llvm::ELFYAML::SectionHeaderTable::TypeStr'
This patch adds a definition to cpp file, I guess it should fix the issue.
Follow-up on D95336. A bunch of these cases were found manually, the
rest made sense to be included to eliminate llvm-else-after-return
Clang-Tidy warnings.
This was discussed in D93678 thread.
Currently we have one special chunk - Fill.
This patch re implements the "SectionHeaderTable" key to become a special chunk too.
With that we are able to place the section header table at any location,
just like we place sections.
Differential revision: https://reviews.llvm.org/D95140
Whilst migrating/retiring some downstream testing, I came across a test
for weak undef IE and LD TLS references, but was unable to find any
equivalent in LLD's upstream testing. There does seem to be some slight
subtle differences that could be worth testing compared to LE TLS
references, in particular that IE can be relaxed to LE in this case,
hence this change.
Differential Revision: https://reviews.llvm.org/D95124
Reviewed by: grimar, MaskRay
This revision allows the base Linalg tiling pattern to optionally require padding to
a constant bounding shape.
When requested, a simple analysis is performed, similar to buffer promotion.
A temporary `linalg.simple_pad` op is added to model padding for the purpose of
connecting the dots. This will be replaced by a more fleshed out `linalg.pad_tensor`
op when it is available.
In the meantime, this temporary op serves the purpose of exhibiting the necessary
properties required from a more fleshed out pad op, to compose with transformations
properly.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D95149
This adds subtarget features for AES, literal, and compare and branch
instruction fusion for different Cortex CPUs.
Patch by: Cassie Jones.
Differential Revision: https://reviews.llvm.org/D94457
This adds support for ".attribute arch" for all extensions that are
currently supported by the compiler.
Differential Revision: https://reviews.llvm.org/D94931
The test ms-lookup-template-base-classes.cpp added in d972d4c749
is failing on some builtbot that don't include x86.
This patch should fix that (following the patterns in the test directory).
Currently, empty lines and comments break alignment of assignments on consecutive
lines. This makes the AlignConsecutiveAssignments option an enum that allows controlling
whether empty lines or empty lines and comments should be ignored when aligning
assignments.
Reviewed By: MyDeveloperDay, HazardyKnusperkeks, tinloaf
Differential Revision: https://reviews.llvm.org/D93986
Currently, empty lines and comments break alignment of assignments on consecutive
lines. This makes the AlignConsecutiveAssignments option an enum that allows controlling
whether empty lines or empty lines and comments should be ignored when aligning
assignments.
Reviewed By: MyDeveloperDay, HazardyKnusperkeks, tinloaf
Differential Revision: https://reviews.llvm.org/D93986
When `Bool{F,G}Option` were introduced, they were designed after the existing `Opt{In,Out}FFlag` in that they implied `CC1Option` for the `ChangedBy` flag.
This means less typing, but can be misleading in situations when the `ResetBy` has explicit `CC1Option` and `ChangedBy` doesn't.
This patch stops implicitly putting `CC1Option` to `ChangedBy` flag.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D95225
The prefix used to be the last (optional) argument to BoolOption. This decision was made with the expectation that only few command line options would need to pass it explicitly instead of using Bool{F,G}Option. It turns out that a considerable number of options don't conform to Bool{F,G}Option and need to provide the prefix anyways. This sometimes requires to explicitly pass `BothFlags<[]>`.
This patch makes prefix the first parameter, so it now directly precedes the spelling base string. Now 8 options dropped `BothFlags<[]>` and only two options (`pthread` and `emit_llvm_uselists`) need to pass an empty prefix.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D95221
This patch adds patterns to teach the AArch64 backend to merge [US]MULL
instructions and adds/subs of half the size into [US]ML[AS]L where we don't use
the top half of the result.
Differential Revision: https://reviews.llvm.org/D95218
Adds the EHFrameSplitter and EHFrameEdgeFixer passes to the default JITLink
pass pipeline for ELF/x86-64, and teaches EHFrameEdgeFixer to handle some
new pointer encodings.
Together these changes enable exception handling (at least for the basic
cases that I've tested so far) for ELF/x86-64 objects loaded via JITLink.
For now, we correct the result for sqrt if iteration > 0. This doesn't make
sense as they are not strict relative.
Reviewed By: dmgreen, spatel, RKSimon
Differential Revision: https://reviews.llvm.org/D94480
* As discussed, fixes the ordering or (operands, results) -> (results, operands) in various `create` like methods.
* Fixes a syntax error in an ODS accessor method.
* Removes the linalg example in favor of a test case that exercises the same.
* Fixes FuncOp visibility to properly use None instead of the empty string and defaults it to None.
* Implements what was documented for requiring that trailing __init__ args `loc` and `ip` are keyword only.
* Adds a check to `InsertionPoint.insert` so that if attempting to insert past the terminator, an exception is raised telling you what to do instead. Previously, this would crash downstream (i.e. when trying to print the resultant module).
* Renames `_ods_build_default` -> `build_generic` and documents it.
* Removes `result` from the list of prohibited words and for single-result ops, defaults to naming the result `result`, thereby matching expectations and what is already implemented on the base class.
* This was intended to be a relatively small set of changes to be inlined with the broader support for ODS generating the most specific builder, but it spidered out once actually testing various combinations, so rolling up separately.
Differential Revision: https://reviews.llvm.org/D95320
Finishing out the support (to the best of my knowledge/based on current
testing running the whole check-lldb with a clang forcibly using
DW_AT_ranges on all DW_TAG_subprograms) for this feature.
Differential Revision: https://reviews.llvm.org/D94064
Reassociating some patterns to generate more fma instructions to
reduce register pressure.
Reviewed By: jsji
Differential Revision: https://reviews.llvm.org/D92071
The GNU token paste extension that removes the comma in , ## __VA_ARGS__
conflicts with C99/C++11's requirements when a variadic macro has no
named parameters: according to the standard, an invocation as FOO()
gives it a single empty argument, and concatenation of anything with an
empty argument is well-defined. For this reason, the GNU extension was
already disabled in C99 standard-conforming mode. It was not yet
disabled in C++11 standard-conforming mode.
The associated comment suggested that GCC keeps this extension enabled
in C90/C++03 standard-conforming mode, but it actually does not, so
rather than adding a check for C++ language version, this change simply
removes the check for C language version.
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D91913
Frame-base materialization may insert vector instructions before EXEC is initialised.
Fix this by moving lowering of llvm.amdgcn.init.exec later in backend.
Also remove SI_INIT_EXEC_LO pseudo as this is not necessary.
Reviewed By: ruiling
Differential Revision: https://reviews.llvm.org/D94645
The patterns that use this really want to know if the operand has at
least 32 sign/zero bits.
This increases opportunities to use W instructions when the original
source used i8/i16. Not sure how much this matters for performance,
but it makes i8/i16 code more consistent with i32.