This adds SoftenFloatRes, PromoteFloatRes and SoftPromoteHalfRes
legalizations for VECREDUCE, to fill the remaining hole in the SDAG
legalization. These legalizations simply expand the reduction and
let it be recursively legalized. For the PromoteFloatRes case at
least it is possible to do better than that, but it's pretty tricky
(because we need to consider the interaction of three different
vector legalizations and the type promotion) and probably not
really worthwhile.
I haven't added ExpandFloatRes support, as I am not familiar with
ppc_fp128.
Differential Revision: https://reviews.llvm.org/D87569
MASM structs are end-padded to have size a multiple of the smaller of the requested alignment and the size of their largest field (taken recursively, if they have a field of STRUCT type).
This matches the behavior of ml.exe and ml64.exe. Our original implementation followed the MASM 6.0 documentation, which instead specified that MASM structs were padded to a multiple of their requested alignment.
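The padding rule above can be sketched in a few lines. This is only an illustration of the arithmetic, not the llvm-ml implementation: the function names are hypothetical, field sizes are given in bytes, and a nested struct is modeled as a nested list.

```python
def largest_field_size(fields):
    """Largest scalar field size, looking into nested structs (nested lists)."""
    return max(
        largest_field_size(f) if isinstance(f, list) else f
        for f in fields
    )


def padded_struct_size(raw_size, requested_alignment, fields):
    """Round raw_size up to a multiple of min(alignment, largest field size)."""
    unit = min(requested_alignment, largest_field_size(fields))
    return (raw_size + unit - 1) // unit * unit


def documented_struct_size(raw_size, requested_alignment):
    """The MASM 6.0 documentation rule: pad to the requested alignment."""
    return (raw_size + requested_alignment - 1) // requested_alignment * requested_alignment


# A 10-byte struct whose largest (nested) field is 2 bytes, alignment 8:
fields = [1, [2, 2], 1, 2, 2]
print(padded_struct_size(10, 8, fields))   # 10 under the ml.exe rule
print(documented_struct_size(10, 8))       # 16 under the documented rule
```

The two rules agree whenever the largest field is at least as large as the requested alignment.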
Reviewed By: thakis
Differential Revision: https://reviews.llvm.org/D87248
Add signed aliases for integral types, as well as the "DF" abbreviation for the FWORD type.
Reviewed By: thakis
Differential Revision: https://reviews.llvm.org/D87246
For selects of the type X == Y ? A : B, check if we can simplify A
by using the X == Y equality and replace the operand if that's
possible. We already try to do this in InstSimplify, but will only
fold if the result of the simplification is the same as B, in which
case the select can be dropped entirely. Here the select will be
retained, just one operand simplified.
As we are performing an actual replacement here, we don't have
problems with refinement / poison values.
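As a rough illustration (the concrete expression is an assumption chosen for this sketch, not taken from the patch), take A = X ^ Y. Under X == Y that operand simplifies to 0, while B is untouched, so the select is retained with one operand replaced. A brute-force check over small integers confirms the two forms agree:

```python
def original(x, y, b):
    # select (x == y), (x ^ y), b
    return (x ^ y) if x == y else b


def simplified(x, y, b):
    # select (x == y), 0, b  -- true operand simplified using x == y
    return 0 if x == y else b


values = range(-4, 5)
agree = all(
    original(x, y, b) == simplified(x, y, b)
    for x in values for y in values for b in values
)
print(agree)
```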
Differential Revision: https://reviews.llvm.org/D87480
Similar to D87415, this folds the various float min/max opcodes
with a constant INF or -INF operand, or FLT_MAX / -FLT_MAX operand
if the ninf flag is set. Some of the folds are only possible under
nnan.
The fminnum(X, INF) with nnan and fmaxnum(X, -INF) with nnan cases
are needed to improve the VECREDUCE_FMIN/FMAX lowerings on X86,
the rest is here for the sake of completeness.
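A small sketch of why some of these folds need nnan. The `fminnum` and `fmaxnum` helpers below are Python models of the "return the non-NaN operand" semantics, written for this illustration only; they are not LLVM code.

```python
import math


def fminnum(x, y):
    """Model of fminnum: if exactly one operand is NaN, return the other."""
    if math.isnan(x):
        return y
    if math.isnan(y):
        return x
    return min(x, y)


def fmaxnum(x, y):
    """Model of fmaxnum: if exactly one operand is NaN, return the other."""
    if math.isnan(x):
        return y
    if math.isnan(y):
        return x
    return max(x, y)


# fmaxnum(X, +INF) is +INF for every X, including NaN: no flag needed.
print(fmaxnum(math.nan, math.inf))   # inf

# fminnum(X, +INF) is X only when X is not NaN; for NaN the result is +INF,
# so folding to X is only valid under nnan.
print(fminnum(3.5, math.inf))        # 3.5
print(fminnum(math.nan, math.inf))   # inf
```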
Differential Revision: https://reviews.llvm.org/D87571
This patch introduces the new .bb_addr_map section feature which allows us to emit the bits needed for mapping binary profiles to basic blocks into a separate section.
The format of the emitted data is represented as follows. It includes a header for every function:
| Address of the function | -> 8 bytes (pointer size)
| Number of basic blocks in this function (>0) | -> ULEB128
The header is followed by a BB record for every basic block. These records are ordered in the same order as MachineBasicBlocks are placed in the function. Each BB Info is structured as follows:
| Offset of the basic block relative to function begin | -> ULEB128
| Binary size of the basic block | -> ULEB128
| BB metadata | -> ULEB128 [ MBB.isReturn() OR MBB.hasTailCall() << 1 OR MBB.isEHPad() << 2 ]
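The layout above can be sketched as a small encoder. This is a hedged illustration rather than the LLVM emitter: the function names are hypothetical, and the 8-byte address is assumed to be little-endian.

```python
import struct


def uleb128(value):
    """Encode a non-negative integer as ULEB128."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def bb_metadata(is_return, has_tail_call, is_eh_pad):
    """Pack the three flags as isReturn | hasTailCall << 1 | isEHPad << 2."""
    return int(is_return) | (int(has_tail_call) << 1) | (int(is_eh_pad) << 2)


def encode_function(address, blocks):
    """blocks: list of (offset, size, metadata) in layout order; must be non-empty."""
    if not blocks:
        raise ValueError("a function entry needs at least one basic block")
    out = bytearray(struct.pack("<Q", address))  # assumed little-endian, 8 bytes
    out += uleb128(len(blocks))
    for offset, size, metadata in blocks:
        out += uleb128(offset) + uleb128(size) + uleb128(metadata)
    return bytes(out)


entry = encode_function(0x1000, [(0, 16, bb_metadata(True, False, False))])
print(entry.hex())
```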
The new feature will replace the existing "BB labels" functionality with -basic-block-sections=labels.
The .bb_addr_map section scrubs the specially-encoded BB symbols from the binary and makes it friendly to profilers and debuggers.
Furthermore, the new feature reduces the binary size overhead from 70% bloat to only 12%.
For more information and results please refer to the RFC: https://lists.llvm.org/pipermail/llvm-dev/2020-July/143512.html
Reviewed By: MaskRay, snehasish
Differential Revision: https://reviews.llvm.org/D85408
Change the analyzed form of type-bound assignment to match that of call
statements. Resolve the binding name to a specific subprogram when
possible by using `GetBindingResolution`. Otherwise leave it as a
type-bound procedure call.
Differential Revision: https://reviews.llvm.org/D87541
Prefer `errorOrWarn` to `fatal` for recoverable errors and graceful degradation
when --noinhibit-exec is specified.
Mention the destination symbol, otherwise the diagnostic is not really actionable.
Two errors are not tested but the patch does not intend to add the coverage.
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D87486
This revision removes dependencies that exist between different string functions. This allows a libc user to use a specific function X of this library without also depending on Y and Z.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D87421
Integer case values were being compared as unsigned by operator<
on evaluate::value::Integer. Change that to signed so that overlap
can be detected correctly.
Explicit CompareUnsigned and BLT are still available if unsigned
comparison is needed.
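A minimal sketch of why the unsigned comparison missed overlaps. The helper names and the 64-bit width are assumptions for illustration; this is not the flang code.

```python
MASK64 = (1 << 64) - 1


def as_unsigned(value):
    """Reinterpret a signed integer as an unsigned 64-bit value."""
    return value & MASK64


def ranges_overlap(a, b, key=lambda v: v):
    """Closed ranges (lo, hi) overlap if each starts no later than the other ends."""
    (lo_a, hi_a), (lo_b, hi_b) = a, b
    return key(lo_a) <= key(hi_b) and key(lo_b) <= key(hi_a)


# CASE (-5:5) and CASE (-1) overlap.
print(ranges_overlap((-5, 5), (-1, -1)))                   # True: signed compare
print(ranges_overlap((-5, 5), (-1, -1), key=as_unsigned))  # False: overlap missed
```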
Fixes https://bugs.llvm.org/show_bug.cgi?id=47309
Differential Revision: https://reviews.llvm.org/D87595
This adds the `__swift_objc_members__` attribute to the semantic
analysis. It annotates ObjC interfaces to provide Swift semantics,
indicating that the types derived from this interface will be
back-bridged to Objective-C to allow interoperability between
Objective-C and Swift.
This is based on the work of the original changes in
8afaf3aad2
Differential Revision: https://reviews.llvm.org/D87395
Reviewed By: Aaron Ballman, Dmitri Gribenko
1ce82015f6 added a fix to restrict phi optimizations after phi
translations. But the current use of performedPhiTranslation only
checked whether phi translation happened for the first iterator and
missed cases where phi translation happens at subsequent
iterators/upwards defs.
This patch changes upward_defs_iterator to take a pointer to a bool, so
we can easily ensure the final value includes all visited defs, while
still being able to conveniently use it with make_range & co.
In C++20, since P0896R4, std::ostream_iterator and std::ostreambuf_iterator
must have std::ptrdiff_t instead of void as a difference_type.
Tests by Casey Carter (thanks!).
Differential Revision: https://reviews.llvm.org/D87459
Added support to the Std dialect cast operations to do casts in vector types when feasible.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D87410
Summary:
In the small code model, the AIX assembler cannot deal with labels
that cannot be reached within the [-0x8000, 0x8000) range from the TOC
base. So when generating the assembly, we need to help the assembler
by subtracting an offset from the label to keep the actual value
within [-0x8000, 0x8000).
Reviewed By: hubert.reinterpretcast, Xiangling_L
Differential Revision: https://reviews.llvm.org/D86879
As discussed in the sibling codegen functionality patch D87571,
this transform was created with D52766, but it is not correct.
The incorrect test diffs were missed during review, but the
'TODO' comment about this functionality was still in the code -
we need 'nnan' to enable this fold.
Clustering loads has caching benefits, but as far as I know there is no
advantage to clustering stores on any AMDGPU subtargets.
The disadvantage is that it tends to increase register pressure and
restricts scheduling freedom.
Differential Revision: https://reviews.llvm.org/D85530
This changes the reported messages to stop using dynamic section names (use `describe()` instead).
This allows avoiding `unwrapOrError` and improves diagnostics.
Differential revision: https://reviews.llvm.org/D87503
It has the following issues:
1) `getStaticSymbolName` returns `std::string`, but the code
assigns the result to `Expected<std::string>`.
2) The code uses `unwrapOrError` and never tests the error reported.
This patch fixes these issues.
Differential revision: https://reviews.llvm.org/D87507
There is some code that can be shared between GNU/LLVM styles.
Also, this fixes 2 inconsistencies related to dumping unknown note types:
1) For GNU style we printed "Unknown note type: (0x00000003)" in some cases, and
"Unknown note type (0x00000003)" (no colon) in other cases.
GNU readelf always prints `:`. This patch removes the related code
duplication and does the same.
2) For LLVM style in some cases we printed "Unknown note type (0x00000003)",
but sometimes just "Unknown (0x00000003)". The latter is the right form, which
is consistent with other unknowns that are printed in LLVM style.
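The two unified output forms can be sketched as follows. The function name and the style strings are hypothetical; only the two message formats come from the description above.

```python
def unknown_note_type(note_type, style):
    """Format an unknown note type the way each output style prints it."""
    hex_value = f"0x{note_type:08x}"
    if style == "GNU":
        # GNU readelf always prints the colon.
        return f"Unknown note type: ({hex_value})"
    if style == "LLVM":
        # Consistent with other unknowns printed in LLVM style.
        return f"Unknown ({hex_value})"
    raise ValueError(f"unsupported style: {style}")


print(unknown_note_type(3, "GNU"))   # Unknown note type: (0x00000003)
print(unknown_note_type(3, "LLVM"))  # Unknown (0x00000003)
```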
Rebased on top of D87453.
Differential revision: https://reviews.llvm.org/D87454
Currently we don't test all core note types that are defined in
`getCoreNoteTypeName` in ELFDumper.cpp.
Also we don't have a test for an unknown core note type.
This patch fixes it.
Differential revision: https://reviews.llvm.org/D87453
Type converter may fail and return nullptr on unconvertible types. The function
conversion did not include a check and was attempting to use a nullptr type to
construct an LLVM function, leading to a crash. Add a check and return early.
The rest of the call stack propagates errors properly.
Fixes PR47403.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D87075
Instcombine limits converting phi types to simple loads and stores. This
does the same in codegenprepare, not processing phis that are not
simple.
Note that for volatile loads/stores, ISel will happily convert between
float and int. Atomics are more likely to always be integer. This just
keeps things simple and doesn't process either.
Differential Revision: https://reviews.llvm.org/D83770
AliasAnalysis/MemoryLocation does not account for loops. Two
MemoryLocations can be must-overwrite, even if the first one writes
multiple locations in a loop.
This patch prevents removing such stores, by only considering candidates
that are known to be loop invariant, or executed in the same BB.
Currently the invariant check is quite conservative and only considers
Alloca and Alloca-like instructions and arguments as invariant base pointers.
It also considers GEPs with all constant indices and invariant bases as
invariant.
This can be improved in the future, but the current implementation has
only minor impact on the total number of stores eliminated (25903 vs
26047 for the baseline). There are some 2-10% swings for some individual
benchmarks. In roughly half of the cases, the number of stores removed
actually increases, because we skip candidates that are unlikely to be
valid candidates early.
LLVM will canonicalize conditional selectors to a different pattern than the one the old code matched.
This updates the function to match the new expected patterns and select SSAT or USAT when successful.
Tests have also been updated to use the new patterns.
Differential Revision: https://reviews.llvm.org/D87379