A few DW_TAG_formal_parameter's of the same function may have the same
name (e.g. variadic (template) functions) or don't have a name at all
(if the parameter isn't used inside the function body), but we still
need to be able to distinguish between them to get correct number of 'total vars'
and 'availability' metric.
Reviewed by: aprantl
Differential Revision: https://reviews.llvm.org/D73003
Here may be more than one out-of-line instance of the same function
among different CUs. All of them should be accounted for to get an accurate
total number of variables/parameters.
Reviewed by: aprantl
Differential Revision: https://reviews.llvm.org/D73002
This makes clang somewhat forward-compatible with new CUDA releases
without having to patch it for every minor release without adding
any new function.
If an unknown version is found, clang issues a warning (can be disabled
with -Wno-cuda-unknown-version) and assumes that it has detected
the latest known version. CUDA releases are usually supersets
of older ones feature-wise, so it should be sufficient to keep
released clang versions working with minor CUDA updates without
having to upgrade clang, too.
Differential Revision: https://reviews.llvm.org/D73231
Summary:
I added subsubsections for typical Clang-tidy entries in Release Notes, so now scripts are aware of this changes.
I don't have GitHub commit access, so please commit changes.
Reviewers: aaron.ballman, alexfh, hokein
Reviewed By: alexfh
Subscribers: njames93, xazax.hun, cfe-commits
Tags: #clang-tools-extra, #clang
Differential Revision: https://reviews.llvm.org/D72527
It can still be beneficial to do the optimization if the result of the compare
is used by *another* select.
Differential Revision: https://reviews.llvm.org/D73511
Performing a cast where the result is ignored caused Clang to crash when
performing codegen for the conversion:
_Complex int a;
void fn1() { (_Complex double) a; }
This patch addresses the crash by not trying to emit the scalar conversions,
causing it to be a noop. Fixes PR44624.
The only thing missing for basic llvm-symbolizer support is the ability on
lib/Object to get a wasm symbol's section ID, which allows sorting and
computation of the symbols' sizes.
Also, when the WasmAsmParser switches sections on new functions, also add the
section to the list of Dwarf sections if Dwarf is being generated for assembly;
this allows writing of simple tests.
Reviewers: sbc100, jhenderson, aardappel
Differential Revision: https://reviews.llvm.org/D73246
DW_TAG_subroutine_type is not really useful for statistics purposes, as it never
has location information. But it may contain DW_TAG_formal_parameter
children that generate number of parameters w/o location and decrease
'availability' metric significantly.
Reviewed by: djtodoro
Differential Revision: https://reviews.llvm.org/D72983
Different variables and functions might have the same name in different CU.
To calculate 'Availability' metric more accurate (i.e. to avoid getting
availability above 100%), we need to have some additional logic to
distinguish between them.
The patch introduces a DIE identifier that consists of a function/variable name
and declaration information: a filename and a line number. This allows
distinguishing different functions/variables (different means declared in
different files/lines) with the same name, keeping duplicates counted
as duplicates.
Reviewed by: aprantl, djtodoro
Differential Revision: https://reviews.llvm.org/D72797
Currently only supports simple copying, other operations to follow.
Reviewers: sbc100, alexshap, jhenderson
Differential Revision: https://reviews.llvm.org/D70930
This patch adds support for explicitly highlighting sub-expressions
shared by multiple leaf nodes. For example consider the following
code
%shared.load = tail call <8 x double> @llvm.matrix.columnwise.load.v8f64.p0f64(double* %arg1, i32 %stride, i32 2, i32 4), !dbg !10, !noalias !10
%trans = tail call <8 x double> @llvm.matrix.transpose.v8f64(<8 x double> %shared.load, i32 2, i32 4), !dbg !10
tail call void @llvm.matrix.columnwise.store.v8f64.p0f64(<8 x double> %trans, double* %arg3, i32 10, i32 4, i32 2), !dbg !10
%load.2 = tail call <30 x double> @llvm.matrix.columnwise.load.v30f64.p0f64(double* %arg3, i32 %stride, i32 2, i32 15), !dbg !10, !noalias !10
%mult = tail call <60 x double> @llvm.matrix.multiply.v60f64.v8f64.v30f64(<8 x double> %trans, <30 x double> %load.2, i32 4, i32 2, i32 15), !dbg !11
tail call void @llvm.matrix.columnwise.store.v60f64.p0f64(<60 x double> %mult, double* %arg2, i32 10, i32 4, i32 15), !dbg !11
We have two leaf nodes (the 2 stores) and the first store stores %trans
which is also used by the matrix multiply %mult. We generate separate
remarks for each leaf (stores). To denote that parts are shared, the
shared expressions are marked as shared (), with a reference to the
other remark that shares it. The operation summary also denotes the
shared operations separately.
Reviewers: anemet, Gerolf, thegameg, hfinkel, andrew.w.kaylor, LuoYuanke
Reviewed By: anemet
Differential Revision: https://reviews.llvm.org/D72526
When a thread stops, this checks depending on the platform if the top frame is
an abort stack frame. If so, it looks for an assert stack frame in the upper
frames and set it as the most relavant frame when found.
To do so, the StackFrameRecognizer class holds a "Most Relevant Frame" and a
"cooked" stop reason description. When the thread is about to stop, it checks
if the current frame is recognized, and if so, it fetches the recognized frame's
attributes and applies them.
rdar://58528686
Differential Revision: https://reviews.llvm.org/D73303
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
This CL adds clang declarations of built-in functions for AMDGPU MFMA intrinsics and instructions.
OpenCL tests for new built-ins are included.
Differential Revision: https://reviews.llvm.org/D72723
In ErrorHandler::error(), rearrange code to avoid calling exitLld with
the mutex locked. Acquire mutex lock when flushing the output streams in
exitLld.
Differential Revision: https://reviews.llvm.org/D73281
Dead instructions do not need to be sunk. Currently we try and record
the recipies for them, but there are no recipes emitted for them and
there's nothing to sink. They can be removed from SinkAfter while
marking them for recording.
Fixes PR44634.
Reviewers: rengolin, hsaito, fhahn, Ayal, gilr
Reviewed By: gilr
Differential Revision: https://reviews.llvm.org/D73423
Currently device lib path set by environment variable HIP_DEVICE_LIB_PATH
does not work due to extra "-L" added to each entry.
This patch fixes that by allowing argument name to be empty in addDirectoryList.
Differential Revision: https://reviews.llvm.org/D73299
Summary:
This CL changes multiple things to improve performance (notably on
Android).We introduce a cache class for the Secondary that is taking
care of this mechanism now.
The changes:
- change the Secondary "freelist" to an array. By keeping free secondary
blocks linked together through their headers, we were keeping a page
per block, which isn't great. Also we know touch less pages when
walking the new "freelist".
- fix an issue with the freelist getting full: if the pattern is an ever
increasing size malloc then free, the freelist would fill up and
entries would not be used. So now we empty the list if we get to many
"full" events;
- use the global release to os interval option for the secondary: it
was too costly to release all the time, particularly for pattern that
are malloc(X)/free(X)/malloc(X). Now the release will only occur
after the selected interval, when going through the deallocate path;
- allow release of the `BatchClassId` class: it is releasable, we just
have to make sure we don't mark the batches containing batches
pointers as free.
- change the default release interval to 1s for Android to match the
current Bionic allocator configuration. A patch is coming up to allow
changing it through `mallopt`.
- lower the smallest class that can be released to `PageSize/64`.
Reviewers: cferris, pcc, eugenis, morehouse, hctim
Subscribers: phosek, #sanitizers, llvm-commits
Tags: #sanitizers, #llvm
Differential Revision: https://reviews.llvm.org/D73507
Summary:
Currently clangd lit tests can't be run in isolation because we don't
set some of the config parameters. This enables running
./bin/llvm-lit ../clang-tools-extra/clangd/test/
or any other test in that subdirectory.
Reviewers: sammccall
Subscribers: mgorny, ilya-biryukov, MaskRay, jkorous, arphaman, usaxena95, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D73538
MSVC 14.24 miscompiles some of LLVM's code, which makes at least these tests fail:
LLVM :: MC/MachO/gen-dwarf-cpp.s
LLVM :: MC/MachO/gen-dwarf-macro-cpp.s
LLVM :: MC/MachO/gen-dwarf-producer.s
LLVM :: MC/MachO/gen-dwarf.s
It seems better to diagnose that at build time. Since both the previous
and the next version have a fix, this might be good enough and we might
not need a real workaround. (We ran into this at
https://crbug.com/1045948)
If you hit this, use either a newer or an older version of MSVC,
or use clang-cl as host compiler.
Differential Revision: https://reviews.llvm.org/D73550
Summary:
This addresses issues raised in https://bugs.llvm.org/show_bug.cgi?id=44454.
There are outstanding issues with multi-line verbatim strings in C# that will be addressed in a follow-up PR.
Reviewers: krasimir, MyDeveloperDay
Reviewed By: krasimir, MyDeveloperDay
Subscribers: MyDeveloperDay
Tags: #clang-format
Differential Revision: https://reviews.llvm.org/D73492
Add the prefixed instructions pld and pstd to future CPU. These are load and
store instructions that require new operand types that are 34 bits. This patch
adds the two instructions as well as the operand types required.
Note that this patch also makes a minor change to tablegen to account for the
fact that some instructions are going to require shifts greater than 31 bits
for the new 34 bit instructions.
Differential Revision: https://reviews.llvm.org/D72574
Summary:
Currently IsControlFlowEquivalent determine if two blocks are control
flow equivalent by checking if A dominates B and B post dominates A.
There exists blocks that are control flow equivalent even if they don't
satisfy the A dominates B and B post dominates A condition.
For example,
if (cond)
A
if (cond)
B
In the PR, we determine if two blocks are control flow equivalent by
also checking if the two sets of conditions A and B depends on are
equivalent.
Reviewer: jdoerfert, Meinersbur, dmgreen, etiotto, bmahjour, fhahn,
hfinkel, kbarton
Reviewed By: fhahn
Subscribers: hiraditya, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D71578
The old method of adding line sequences one by one can easily go
quadratic if the sequences are not perfectly sorted. The equivalent
change in DWARF brought a considerable improvement in line table
parsing. It is not clear if the same will be the case for PDB, but this
does bring us a step closer towards removing the dangerous API.
This makes the types almost seamlessly interchangeable in C++17
codebases. Eventually we want to replace StringRef with the standard
type, but that requires C++17 being the default and a huge refactoring
job as StringRef has a lot more functionality.
This reverts commit 1b12766883 because of
breaking the mac test suite.
I'm not certain this is the cause because of a concurrent build breakage
which masked this problem, but the failure messages are related to
symbol lookup, which makes this very likely.
Summary: X86 has instructions to calculate fma and fneg at the same time. But we combine the fneg and fma only when fneg is the source operand under strict FP.
Reviewers: craig.topper, andrew.w.kaylor, uweigand, RKSimon, LiuChen3
Subscribers: LuoYuanke, llvm-commits, cfe-commits, jdoerfert, hiraditya
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D72824
ELF for the ARM architecture requires linkers to provide
interworking for symbols that are of type STT_FUNC. Interworking for
other symbols must be encoded directly in the object file. LLD was always
providing interworking, regardless of the symbol type, this breaks some
programs that have branches from Thumb state targeting STT_NOTYPE symbols
that have bit 0 clear, but they are in fact internal labels in a Thumb
function. LLD treats these symbols as ARM and inserts a transition to Arm.
This fixes the problem for in range branches, R_ARM_JUMP24,
R_ARM_THM_JUMP24 and R_ARM_THM_JUMP19. This is expected to be the vast
majority of problem cases as branching to an internal label close to the
function.
There is at least one follow up patch required.
- R_ARM_CALL and R_ARM_THM_CALL may do interworking via BL/BLX
substitution.
In theory range-extension thunks can be altered to not change state when
the symbol type is not STT_FUNC. I will need to check with ld.bfd to see if
this is the case in practice.
Fixes (part of) https://github.com/ClangBuiltLinux/linux/issues/773
Differential Revision: https://reviews.llvm.org/D73474