This patch adds the codegen support for `atomic compare capture` in clang.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D120290
Clang's doxygen documentation specifies LLVM revision. Do the same for BOLT.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D126912
brace-init-list containing a single element of a different scoped
enumeration type
It is rejected because it doesn't satisfy the condition that the element
has to be implicitly convertible to the underlying type of the
enumeration.
http://eel.is/c++draft/dcl.init.list#3.8
Differential Revision: https://reviews.llvm.org/D126084
Similar to complex128/complex64, float16 has no direct support
in the ctypes implementation. This fixes the issue by using a
custom F16 type to change the view in and out of MLIR code
Reviewed By: wrengr
Differential Revision: https://reviews.llvm.org/D126928
Since version 0.7 we've added:
* Initial language support for TableGen
* Tweaked syntax highlighting for PDLL
* Added a new command to view intermediate PDLL output
This commit enables providing long-form documentation more seamlessly to the LSP
by revamping decl documentation. For ODS imported constructs, we now also import
descriptions and attach them to decls when possible. For PDLL constructs, the LSP will
now try to provide documentation by parsing the comments directly above the decls
location within the source file. This commit also adds a new parser flag
`enableDocumentation` that gates the import and attachment of ODS documentation,
which is unnecessary in the normal build process (i.e. it should only be used/consumed
by tools).
Differential Revision: https://reviews.llvm.org/D124881
modernize-use-emplace only recommends going from a push_back to an
emplace_back, but does not provide a recommendation when emplace_back is
improperly used. This adds the functionality of warning the user when
an unecessary temporary is created while calling emplace_back or other "emplacy"
functions from the STL containers.
Reviewed By: kuhar, ivanmurashko
Differential Revision: https://reviews.llvm.org/D101471
This patch proposed to use a new cost model for loop interchange, which
is obtained from loop cache analysis.
Given a loopnest, what loop cache analysis returns is a vector of loops
[loop0, loop1, loop2, ...] where loop0 should be replaced as the outermost
loop, loop1 should be placed one more level inside, and loop2 one more level
inside, etc. What loop cache analysis does is not only more comprehensive than
the current cost model, it is also a "one-shot" query which means that we only
need to query it once during the entire loop interchange pass, which is better
than the current cost model where we query it every time we check whether it is
profitable to interchange two loops. Thus complexity is reduced, especially after
D120386 where we do more interchanges to get the globally optimal loop access pattern.
Updates made to test cases are mostly minor changes and some corrections.
Test coverage for loop interchange is not reduced.
Currently we did not completely remove the legacy cost model, but keep it as
fall-back in case the new cost model did not run successfully. This is because
currently we have some limitations in delinearization, which sometimes makes
loop cache analysis bail out. The longer term goal is to enhance delinearization
and eventually remove the legacy cost model compeletely.
Reviewed By: bmahjour, #loopoptwg
Differential Revision: https://reviews.llvm.org/D124926
This is alternative to https://reviews.llvm.org/D121733
and helps with Clang header modules in which FILE
may expand to "./foo.h" or "foo.h" depending on whether the file was
included directly or not.
Only do this when UseTargetPathSeparator is true, as we are already
changing the path in that case.
Reviewed By: ayzhao
Differential Revision: https://reviews.llvm.org/D126396
This patch removes all `IgnoreImpCasts` in Sema, and only uses it if necessary. If the expression is not of the same type as the pointer value, a cast is inserted.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D126602
This patch improves the codegen of extractelement and insertelement for vector
containing 8 elements. Before, a dag combine transformation was generating a
sequence of 8 select/cmp.
This patch changes the upper limit for this transformation and the movrel
instruction will eventually be used instead. Extractlement/insertelement for
vectors containing less than 8 elements are unchanged.
Differential Revision: https://reviews.llvm.org/D126389
This makes it possible to crosscompile runtimes with cl.exe on Windows.
An external project is completely misconfigured otherwise because
cmake_args is set only for native builds or builds crosscompiled with
clang.
Differential Revision: https://reviews.llvm.org/D122578
Reviewed By: beanz, compnerd
While it's not as robust as using the attribute on enums/classes (the
type information may be lost through a function pointer, a declaration
or use of the underlying type without using the typedef, etc) but I
think there's still value in being able to attribute a typedef and have
all return types written with that typedef pick up the
warn_unused_result behavior.
Specifically I'd like to be able to annotate LLVMErrorRef (a wrapper for
llvm::Error used in the C API - the underlying type is a raw pointer, so
it can't be attributed itself) to reduce the chance of unhandled errors.
Differential Revision: https://reviews.llvm.org/D102122
If the imm is out of range for an ADDI, we will materialize it in
a register using multiple instructions. If the ADD is used by a
load/store, doPeepholeLoadStoreADDI can try to pull an ADDI from
the constant materialization into the load/store offset. This only
works if the ADD has a single use, otherwise the peephole would have
to rebuild multiple nodes.
This patch instead tries to solve the problem when the add is selected.
We check that the add is only used by loads/stores and if it is
we will select it to (ADDI (ADD X, Imm-Lo12), Lo12). This will enable
the simple case in doPeepholeLoadStoreADDI that can bypass an ADDI
used as a pointer. As a result we can remove the more complicated
peephole from doPeepholeLoadStoreADDI.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D126576
.zdebug is unlikely used any longer: gcc -gz switched from legacy
.zdebug to SHF_COMPRESSED with binutils 2.26 (2016), which has been
several years. clang 14 dropped -gz=zlib-gnu support. According to
Debian Code Search (`gz=zlib-gnu`), no project uses -gz=zlib-gnu.
Remove .zdebug support to (a) simplify code and (b) allow removal of llvm-mc's
--compress-debug-sections=zlib-gnu.
In case the old object file `a.o` uses .zdebug, run `objcopy --decompress-debug-sections a.o`
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D126793
This commit defines a dataflow analysis for integer ranges, which
uses a newly-added InferIntRangeInterface to compute the lower and
upper bounds on the results of an operation from the bounds on the
arguments. The range inference is a flow-insensitive dataflow analysis
that can be used to simplify code, such as by statically identifying
bounds checks that cannot fail in order to eliminate them.
The InferIntRangeInterface has one method, inferResultRanges(), which
takes a vector of inferred ranges for each argument to an op
implementing the interface and a callback allowing the implementation
to define the ranges for each result. These ranges are stored as
ConstantIntRanges, which hold the lower and upper bounds for a
value. Bounds are tracked separately for the signed and unsigned
interpretations of a value, which ensures that the impact of
arithmetic overflows is correctly tracked during the analysis.
The commit also adds a -test-int-range-inference pass to test the
analysis until it is integrated into SCCP or otherwise exposed.
Finally, this commit fixes some bugs relating to the handling of
region iteration arguments and terminators in the data flow analysis
framework.
Depends on D124020
Depends on D124021
Reviewed By: rriddle, Mogball
Differential Revision: https://reviews.llvm.org/D124023
advisor.
This patch has no functional change, and merely a preparation patch for
main functional change. The motivating use case is to annotate inline
remark pass name with context information (e.g. prelink or postlink,
CGSCC or always-inliner), see D125495 for more details.
Differential Revision: https://reviews.llvm.org/D126824
We could go either way on this and several similar matches.
Just matching as a binop is possibly slightly more efficient;
we don't need to re-confirm the opcode of the instruction.
If C is non-negative, the result of the smax must also be
non-negative, so all sign bits of the result are 0.
This allows DAGCombiner to remove a zext_inreg in the modified test.
This zext_inreg started as a sext that became zext before type
legalization then was promoted to a zext_inreg.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D126896
In 6423a9f0ec, I accidentally thought this was
getting tested, but these variables are unused. Just remove the lines instead of
leaving them commented out.
Differential Revision: https://reviews.llvm.org/D126901
The memkind library is only available for linux. Calling dlopen here
can also be problematic in a client app that fork'ed.
Differential Revision: https://reviews.llvm.org/D126579
The example was still using the -now- removed sparse_tensor.init_tensor.
Also, I made the input operands of the matrix multiplication sparse too
(since it looks a bit strange to multiply two dense matrices into a sparse).
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D126897
This patch includes MC layer support for VOP3 encoded instructions and generic VOP support
classes.
Some VOP1 and VOP2 instructions which share an encoding with gfx10 and are using
the AssemblerPredicate = isGFX10Plus are also enabled. That predicate
will be changed to isGFX10Only in a later patch.
Patch 15/N for upstreaming of AMDGPU gfx11 architecture.
Depends on D126468
Reviewed By: dp
Differential Revision: https://reviews.llvm.org/D126475
In strict mode, only the new inserted operation is allowed to add to the
worklist. Before this change, it would add the users of a replaced op
and it didn't check if the users are allowed to be pushed into the
worklist
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D126899
Improved/fixed cost modeling for shuffles by providing masks, improved
cost model for non-identity insertelements.
Differential Revision: https://reviews.llvm.org/D115462
The "#!" line in all scan-build-py scripts were using just bare
"/usr/bin/python" which according to PEP-0394 can be either python3,
python2 or not exist at all.
E.g in latest debian and ubuntu releases "/usr/bin/python" does not
exist at all by default and user must install python-is-python2 or
python-is-python3 packages to get the bare version less "python"
command.
Until recently (70b06fe8a1 "scan-build-py: Force the opening in utf-8"
changed "libscanbuild") these scripts worked in both python2 and
python3, but now they (rightfully) are python3 only, and broke on
systems where the "python" command means python2.
By changing the "#!" to be "python3" it is not only explicit that the
scripts require python3 it also works on systems where "python" command
is python2 or nonexistent.
Differential Revision: https://reviews.llvm.org/D126804
The function promises to canonicalize the path, but neglected to do so
for the root component.
For example, calling remove_dots("/tmp/foo.c", Style::windows_backslash)
resulted in "/tmp\foo.c". Now it produces "\tmp\foo.c".
Also fix FIXME in the corresponding test.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D126412
MC layer support for ds instructions
Contributors:
Piotr Sobczak <Piotr.Sobczak@amd.com>
Patch 14/N for upstreaming of AMDGPU gfx11 architecture.
Depends on D126463
Reviewed By: arsenm, #amdgpu
Differential Revision: https://reviews.llvm.org/D126468