This time with correct types for the data result from the SUB.
Original commit message:
Our normal lowering for ISD::SETCC uses X86ISD::SUB to enable
CSE unless the RHS is 0. optimizeCompareInstr called by the peephole
pass can turn subs with unused results into cmps to clean this up.
This commit makes other places that create X86ISD::CMP have the
same behavior.
Summary:
It can be useful to tune the default inline threshold without overriding other inlining thresholds (e.g. in code compiled for size).
The existing `-inline-threshold` flag overrides other thresholds, so it is insufficient in codebases where there is a mix of code compiled for size and speed.
Patch by Michael Holman <michael.holman@microsoft.com>
Reviewers: eraman, tejohnson
Reviewed By: tejohnson
Subscribers: tejohnson, mtrofin, davidxl, hiraditya, haicheng, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73217
Summary:
Changes:
- Calls to consteval functions are now evaluated in a constant context, but IR is still generated for them.
- Add a diagnostic for taking the address of a consteval function in a non-constexpr context.
- Add a diagnostic for the address of a consteval function being accessible at runtime.
- Add tests.
Reviewers: rsmith, aaron.ballman
Reviewed By: rsmith
Subscribers: mgrang, riccibruno, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D63960
Fixes a wimpy-mode CTS failure for asin(float).
Passes non-wimpy for both float/double on RX580.
Signed-off-by: Aaron Watry <awatry@gmail.com>
Tested-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
shouldOptimizeForSize is showing up in a profile, spending around 10%
of the pass time in one function. This should probably not be so slow,
but the much cheaper attribute check should be done first anyway.
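
A minimal sketch of the reordering idea, not the actual change. The message does not say how the two checks are combined, so this assumes a short-circuiting logical AND. `FunctionSketch`, `expensiveShouldOptimizeForSize`, and `shouldTransform` are hypothetical names, and the counter exists only to make the skipped query visible.

```cpp
#include <cassert>

// Hypothetical stand-in for a function; only the fields this sketch needs.
struct FunctionSketch {
  bool HasCheapAttr;          // cheap: a stored attribute bit
  bool ProfileSaysOptForSize; // models the answer of an expensive profile query
};

// Counts how often the expensive query runs.
int ExpensiveQueryCount = 0;

// Hypothetical expensive check (assumption: models a profile-driven heuristic).
bool expensiveShouldOptimizeForSize(const FunctionSketch &F) {
  ++ExpensiveQueryCount;
  return F.ProfileSaysOptForSize;
}

// The cheap attribute test comes first. Because && short-circuits, the
// expensive query is skipped whenever the attribute is absent.
bool shouldTransform(const FunctionSketch &F) {
  return F.HasCheapAttr && !expensiveShouldOptimizeForSize(F);
}
```

With the operands in the opposite order the result would be the same, but the expensive query would run for every function.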
Summary: Add comments to the list of tokens that can follow the ']' at the end of a C# attribute specifier to prevent comments after attribute specifiers from being formatted as continuations.
Reviewers: MyDeveloperDay, krasimir
Reviewed By: MyDeveloperDay
Tags: #clang-format
Differential Revision: https://reviews.llvm.org/D73977
Prepare to accurately track the future denormal-fp-math attribute
changes. The way to actually set these separately is not wired in yet.
This is just a mechanical change, and mostly still assumes the input
and output mode match. This should be refined for some cases. For
example, fcanonicalize lowering should use the flushing variant if
either input or output flushing is enabled.
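
A small sketch of the "either input or output" rule mentioned above, under stated assumptions. The type and function names (`DenormalKindSketch`, `DenormalModeSketch`, `canonicalizeShouldFlush`) are hypothetical and are not taken from the patch.

```cpp
#include <cassert>

// Hypothetical names throughout; this is not LLVM's API.
enum class DenormalKindSketch { IEEE, FlushToZero };

// Tracks the input and output treatment of denormals separately.
struct DenormalModeSketch {
  DenormalKindSketch Output; // how denormal results are produced
  DenormalKindSketch Input;  // how denormal operands are read
};

// A canonicalize lowering would choose the flushing variant if either
// the input or the output side flushes denormals.
bool canonicalizeShouldFlush(const DenormalModeSketch &M) {
  return M.Input == DenormalKindSketch::FlushToZero ||
         M.Output == DenormalKindSketch::FlushToZero;
}
```

Code that still assumes the two modes match would only look at one field, which is the case the message says should be refined.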
In order to synthesize tail call frames, the stack frame list must not
be empty (otherwise, there is no "previous" frame to infer a tail call
from).
This case is hard to hit. To trigger it, we must first fail to push
`unwind_frame_sp` because we either fail to get its SymbolContext, or
given its SymbolContext the GetParentOfInlineScope call fails. This
causes m_concrete_frames_fetched to be incremented while m_frames
remains empty. Then, the next frame in the stack may fail within
SynthesizeTailCallFrames. This crash arose during a kernel debugging
session.
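
The precondition described above can be sketched as a guard on the frame list. This is an illustration only: `FrameSketch`, `previousFrame`, and `trySynthesizeTailCallFrames` are hypothetical names, and the string concatenation is a placeholder for the real inference.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a stack frame.
struct FrameSketch {
  std::string Name;
};

// Returns the most recently pushed frame, or nullptr when the list is empty.
const FrameSketch *previousFrame(const std::vector<FrameSketch> &Frames) {
  return Frames.empty() ? nullptr : &Frames.back();
}

// Bails out when there is no previous frame to infer a tail call from.
bool trySynthesizeTailCallFrames(std::vector<FrameSketch> &Frames,
                                 const FrameSketch &Next) {
  const FrameSketch *Prev = previousFrame(Frames);
  if (!Prev)
    return false; // empty list: nothing to synthesize from
  // Build the placeholder frame before pushing, since push_back may
  // reallocate and invalidate Prev.
  FrameSketch Synth{Prev->Name + " -> (tail call) -> " + Next.Name};
  Frames.push_back(Synth);
  return true;
}
```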
rdar://59147051
The usage of the Imm out argument from SelectSMRDOffset is pretty
confusing. Stop trying to reject CI immediates in the case where the
offset field can be used. It's not an illegal way to encode the
immediate, so just prefer the better encoding pattern with
AddedComplexity.
We probably don't even really need the different opcodes for the
different offset types anymore, but that will be more work to clean up.
The SMRD non-buffer load patterns could also use a cleanup to be done
separately.
Summary: Spells out some `auto`s explicitly and adds another test for the matcher `isExpandedFromMacro`.
Reviewers: aaron.ballman
Subscribers: gribozavr, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D73975
Summary:
Add a debug check for frequency queries on unknown blocks (typically blocks
that are created after BFI is computed but whose frequencies are not
communicated to BFI).
This is useful for detecting and debugging missed BFI updates.
The check is available in debug builds only and is disabled behind a flag.
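
A rough sketch of what such a check could look like, assuming a simple map from block id to frequency. `BlockFreqSketch` and `CheckUnknownBlocks` are hypothetical names. The sketch counts unknown queries instead of asserting, purely so that its behaviour can be observed.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Hypothetical stand-in for block frequency info keyed by a block id.
struct BlockFreqSketch {
  std::unordered_map<int, std::uint64_t> Freqs; // block id -> frequency
  bool CheckUnknownBlocks = false;              // hypothetical flag, off by default
  unsigned UnknownQueries = 0;                  // queries for untracked blocks

  // Returns the recorded frequency, or 0 for a block that was never
  // registered. With the check enabled, such queries are recorded.
  std::uint64_t getFreq(int BlockId) {
    auto It = Freqs.find(BlockId);
    if (It != Freqs.end())
      return It->second;
    if (CheckUnknownBlocks)
      ++UnknownQueries;
    return 0;
  }
};
```

A non-zero `UnknownQueries` would point at a block that was created after the frequencies were computed and never reported back.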
Reviewers: davidxl
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73920
AMDGPU and x86 at least both have separate controls for whether
denormal results are flushed on output, and for whether denormals are
implicitly treated as 0 as an input. The current DAGCombiner use only
really cares about the input treatment of denormals.
Similar to D73680 (AArch64 BTI).
A local linkage function whose address is not taken does not need ENDBR32/ENDBR64. Placing the patch label after ENDBR32/ENDBR64 has the advantage that code does not need to differentiate whether the function has an initial ENDBR.
Also, add 32-bit tests and test that .cfi_startproc is at the function
entry. The line information has a general implementation and is tested
by AArch64/patchable-function-entry-empty.mir.
Reviewed By: nickdesaulniers
Differential Revision: https://reviews.llvm.org/D73760
By subtracting 1 from Size at the beginning we can simplify the
subsequent calculations. This also saves 4 instructions on aarch64
and 9 instructions on x86_64, but seems to be perf neutral.
Differential Revision: https://reviews.llvm.org/D73936
Linux commit
1cf5b23988 (diff-289313b9fec99c6f0acfea19d9cfd949)
uses "#pragma clang attribute push (__attribute__((preserve_access_index)),
apply_to = record)"
to apply CO-RE relocations to all records including the following pattern:
  #pragma clang attribute push (__attribute__((preserve_access_index)), apply_to = record)
  typedef struct {
    int a;
  } __t;
  #pragma clang attribute pop
  int test(__t *arg) { return arg->a; }
The current approach of using the struct type in the relocation record
results in an anonymous struct, which makes later type matching difficult
in the bpf loader. In fact, the current BPF backend fails on the above program
with an assertion:
  clang: ../lib/Target/BPF/BPFAbstractMemberAccess.cpp:796: ...
  Assertion `TypeName.size()' failed.
The patch uses the base lvalue type for the "base" value to annotate the
preserve_{struct,union}_access_index intrinsics. In the above example,
the type will be "__t", which preserves the type name.
Differential Revision: https://reviews.llvm.org/D73900
Summary:
This patch is a step towards enabling BUILD_SHARED_LIBS=on, which
builds most libraries as DLLs instead of statically linked libraries.
The main effect of this is that incremental build times are greatly
reduced, since usually only one library need be relinked in response
to isolated code changes.
The bulk of this patch is fixing incorrect usage of cmake, where library
dependencies are listed under add_dependencies rather than under
target_link_libraries or under the LINK_LIBS tag. Correct usage should be
like this:
  add_dependencies(MLIRfoo MLIRfooIncGen)
  target_link_libraries(MLIRfoo MLIRlib1 MLIRlib2)
A separate issue is that in cmake, dependencies between static libraries
are propagated automatically. In the above example, if MLIRlib1
depends on MLIRlib2, then it is sufficient to list only MLIRlib1 in
target_link_libraries. When compiling with shared libraries, it is necessary
to specify both MLIRlib1 and MLIRlib2 if MLIRfoo uses symbols from both.
Reviewers: mravishankar, antiagainst, nicolasvasilache, vchuravy, inouehrs, mehdi_amini, jdoerfert
Reviewed By: nicolasvasilache, mehdi_amini
Subscribers: Joonsoo, merge_guards_bot, jholewinski, mgorny, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, csigg, arpith-jacob, mgester, lucyrfox, herhut, aartbik, liufengdb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73653
Linux commit
1cf5b23988 (diff-289313b9fec99c6f0acfea19d9cfd949)
uses "#pragma clang attribute push (__attribute__((preserve_access_index)),
apply_to = record)"
to apply CO-RE relocations to all records including the following pattern:
  #pragma clang attribute push (__attribute__((preserve_access_index)), apply_to = record)
  typedef struct {
    int a;
  } __t;
  #pragma clang attribute pop
  int test(__t *arg) { return arg->a; }
The current approach of using the struct/union type in the relocation record
results in an anonymous struct, which makes later type matching difficult
in the bpf loader. In fact, the current BPF backend fails on the above program
with an assertion:
  clang: ../lib/Target/BPF/BPFAbstractMemberAccess.cpp:796: ...
  Assertion `TypeName.size()' failed.
clang will change to use the type of the base of the member access,
which preserves the typedef modifier for the
preserve_{struct,union}_access_index intrinsics in the above example.
Here we adjust the BPF backend to accept that the debuginfo
type metadata may be a 'typedef' and to handle it properly.
Differential Revision: https://reviews.llvm.org/D73902
Summary:
This revision adds a matcher `isExpandedFromMacro` that determines whether a
statement is (transitively) expanded from a given macro.
Reviewers: gribozavr
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D73965
Summary:
This is a fairly ugly hack - we back off several features for any variable
whose type isn't deduced, to avoid computing/caching linkage.
Better suggestions welcome.
Fixes https://github.com/clangd/clangd/issues/274
Reviewers: kadircet, kbobyrev
Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, usaxena95, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D73960
This patch reverts part of r362750 / D62650, which stopped
LiveDebugVariables from trimming leading variable location ranges down
to only covering those instructions that are in scope. I've observed some
circumstances where the number of DBG_VALUEs in a function can be
amplified unnecessarily to cover more instructions that are
out of scope, leading to very slow compile times. Trimming the range
of instructions that the variables cover solves the slow compile times.
The specific problem that r362750 tries to fix is addressed by the
assignment to RStart that I've added. Any variable location that begins
at the first instruction of a block will now be considered to begin at the
start of the block. While these sound the same, they have different
SlotIndexes, and the register allocator may shoehorn additional
instructions in between the two. The test added in the past
(wrong_debug_loc_after_regalloc.ll) still works with this modification.
live-debug-variables.ll has a range trimmed to not cover the prologue of
the function, while dbg-addr-dse.ll has a DBG_VALUE sink past one
instruction with no DebugLoc, which is expected behaviour.
Differential Revision: https://reviews.llvm.org/D73691
Summary:
After following Simon's suggestion about additional testing posted at
https://reviews.llvm.org/D73906, I found several more places that
need to be updated.
Reviewers: simon_tatham, dmgreen, ostannard, eli.friedman
Reviewed By: simon_tatham
Subscribers: merge_guards_bot, kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73963
Summary: By default it's 512K, which is way too small for the clang parser to run on. There is no way to change it via a platform-independent API, so it is implemented via pthreads directly in clangd/Threading.cpp.
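
A minimal sketch of setting a thread attribute through pthreads, assuming the limit in question is the thread stack size (the summary does not name it explicitly). `runTask` and `runWithStackSize` are hypothetical helpers and are not the code in clangd/Threading.cpp. The size a caller passes is also an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <pthread.h>

// Thread entry point: marks the flag it was given so the caller can
// confirm that the thread actually ran.
static void *runTask(void *Arg) {
  *static_cast<bool *>(Arg) = true;
  return nullptr;
}

// Runs a trivial task on a thread created with the requested stack size.
// Returns true only if the attribute was accepted and the task ran.
bool runWithStackSize(std::size_t StackBytes) {
  pthread_attr_t Attr;
  if (pthread_attr_init(&Attr) != 0)
    return false;
  bool Ran = false;
  pthread_t Thread;
  bool Ok = pthread_attr_setstacksize(&Attr, StackBytes) == 0 &&
            pthread_create(&Thread, &Attr, runTask, &Ran) == 0;
  pthread_attr_destroy(&Attr); // safe once the thread has been created
  if (!Ok)
    return false;
  pthread_join(Thread, nullptr);
  return Ran;
}
```

A caller might request, for example, `runWithStackSize(8u * 1024 * 1024)`. On some toolchains the program has to be built with `-pthread`.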
Fixes https://github.com/clangd/clangd/issues/273
Patch by Dmitry Kozhevnikov!
Reviewers: ilya-biryukov, sammccall, arphaman
Reviewed By: ilya-biryukov, sammccall, arphaman
Subscribers: dexonsmith, umanwizard, jfb, ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D50993