Commit Graph

425512 Commits

Author SHA1 Message Date
Nicolas van Kempen 987f9cb6b9 [clang-tidy] Add proper emplace checks to modernize-use-emplace
modernize-use-emplace only recommends going from a push_back to an
emplace_back, but does not provide a recommendation when emplace_back is
improperly used. This adds the functionality of warning the user when
an unecessary temporary is created while calling emplace_back or other "emplacy"
functions from the STL containers.

Reviewed By: kuhar, ivanmurashko

Differential Revision: https://reviews.llvm.org/D101471
2022-06-03 00:14:57 +01:00
Congzhe Cao 006334470d [LoopInterchange] New cost model for loop interchange
This patch proposed to use a new cost model for loop interchange, which
is obtained from loop cache analysis.

Given a loopnest, what loop cache analysis returns is a vector of loops
[loop0, loop1, loop2, ...] where loop0 should be replaced as the outermost
loop, loop1 should be placed one more level inside, and loop2 one more level
inside, etc. What loop cache analysis does is not only more comprehensive than
the current cost model, it is also a "one-shot" query which means that we only
need to query it once during the entire loop interchange pass, which is better
than the current cost model where we query it every time we check whether it is
profitable to interchange two loops. Thus complexity is reduced, especially after
D120386 where we do more interchanges to get the globally optimal loop access pattern.

Updates made to test cases are mostly minor changes and some corrections.
Test coverage for loop interchange is not reduced.

Currently we did not completely remove the legacy cost model, but keep it as
fall-back in case the new cost model did not run successfully. This is because
currently we have some limitations in delinearization, which sometimes makes
loop cache analysis bail out. The longer term goal is to enhance delinearization
and eventually remove the legacy cost model compeletely.

Reviewed By: bmahjour, #loopoptwg

Differential Revision: https://reviews.llvm.org/D124926
2022-06-02 19:07:14 -04:00
Paul Pluzhnikov 4ad17d2e96 Clean "./" from __FILE__ expansion.
This is alternative to https://reviews.llvm.org/D121733
and helps with Clang header modules in which FILE
may expand to "./foo.h" or "foo.h" depending on whether the file was
included directly or not.

Only do this when UseTargetPathSeparator is true, as we are already
changing the path in that case.

Reviewed By: ayzhao

Differential Revision: https://reviews.llvm.org/D126396
2022-06-02 18:00:19 -04:00
Shilei Tian 3a96256b7e [Clang][OpenMP] Avoid using `IgnoreImpCasts` if possible
This patch removes all `IgnoreImpCasts` in Sema, and only uses it if necessary. If the expression is not of the same type as the pointer value, a cast is inserted.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D126602
2022-06-02 17:45:02 -04:00
Paul Robinson aa1cdf87b5 [PS5] Ignore 'packed' on one-byte bitfields, matching PS4 2022-06-02 14:41:18 -07:00
Mehdi Amini 4e5ce2056e Revert "[mlir] Add integer range inference analysis"
This reverts commit 1350c9887d.

Shared library build is broken with undefined references.
2022-06-02 21:24:06 +00:00
Matt Arsenault dd7e407d81 AMDGPU: Move SpilledReg from MFI to SIRegisterInfo
This isn't the most natural place for it, but it avoids a circular
include dependency in an out of tree patch.
2022-06-02 17:11:24 -04:00
Julien Pages 2dfe419446 [AMDGPU] Improve codegen of extractelement/insertelement in some cases
This patch improves the codegen of extractelement and insertelement for vector
containing 8 elements. Before, a dag combine transformation was generating a
sequence of 8 select/cmp.
This patch changes the upper limit for this transformation and the movrel
instruction will eventually be used instead. Extractlement/insertelement for
vectors containing less than 8 elements are unchanged.

Differential Revision: https://reviews.llvm.org/D126389
2022-06-02 17:05:55 -04:00
Alexander Smarus 4e1b89064f cmake fill `cmake_args` when cross-compiling external project with non-clang compiler
This makes it possible to crosscompile runtimes with cl.exe on Windows.

An external project is completely misconfigured otherwise because
cmake_args is set only for native builds or builds crosscompiled with
clang.

Differential Revision: https://reviews.llvm.org/D122578
Reviewed By: beanz, compnerd
2022-06-02 21:02:58 +00:00
David Blaikie cb08f4aa44 Support warn_unused_result on typedefs
While it's not as robust as using the attribute on enums/classes (the
type information may be lost through a function pointer, a declaration
or use of the underlying type without using the typedef, etc) but I
think there's still value in being able to attribute a typedef and have
all return types written with that typedef pick up the
warn_unused_result behavior.

Specifically I'd like to be able to annotate LLVMErrorRef (a wrapper for
llvm::Error used in the C API - the underlying type is a raw pointer, so
it can't be attributed itself) to reduce the chance of unhandled errors.

Differential Revision: https://reviews.llvm.org/D102122
2022-06-02 20:57:31 +00:00
Craig Topper dbead2388b [RISCV] Add custom isel for (add X, imm) used by load/stores.
If the imm is out of range for an ADDI, we will materialize it in
a register using multiple instructions. If the ADD is used by a
load/store, doPeepholeLoadStoreADDI can try to pull an ADDI from
the constant materialization into the load/store offset. This only
works if the ADD has a single use, otherwise the peephole would have
to rebuild multiple nodes.

This patch instead tries to solve the problem when the add is selected.
We check that the add is only used by loads/stores and if it is
we will select it to (ADDI (ADD X, Imm-Lo12), Lo12). This will enable
the simple case in doPeepholeLoadStoreADDI that can bypass an ADDI
used as a pointer. As a result we can remove the more complicated
peephole from doPeepholeLoadStoreADDI.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D126576
2022-06-02 13:45:32 -07:00
Florian Hahn 78c6b1488f
[CaptureTracking] Increase limit and use it for all visited uses.
Currently the MaxUsesToExplore limit only applies to the number of users
per value, not the total number of users to explore.

The current limit of 20 pessimizes IR with opaque pointers in some
cases. Without opaque pointers, we have deeper pointer def-use chains in
general due to extra bitcasts and geps for structs with index 0.

With opaque pointers the def-use chain is not as deep but wider, due to
bitcasts & 0-geps missing.

To improve the situation for opaque pointers, this patch does 2 things:

 1. Apply the limit to the total number of uses visited. From the
    wording in the description of the option it seems like this may be
    the original intention. With the current implementation we could
    still end up walking a lot of uses.
 2. Increase the limit to 100. This is quite arbitrary, but enables
    a good number of additional optimizations.

Those adjustments have a noticeable compile-time impact though. In part
that is likely due to additional transformations (and conversely
the current baseline misses optimizations after switching to opaque
pointers).

This recovers some regressions that showed up after enabling opaque
pointers.

Limit=100:

* NewPM-O3: +0.21%
* NewPM-ReleaseThinLTO: +0.87%
* NewPM-ReleaseLTO-g: +0.46%

https://llvm-compile-time-tracker.com/compare.php?from=2e50ecb2ef4e1da1aeab05bcf66380068e680991&to=7e6fbe519d958d09f32f01d5d44a622f551e2031&stat=instructions

Limit=60:

* NewPM-O3: +0.14%
* NewPM-ReleaseThinLTO: +0.41%
* NewPM-ReleaseLTO-g: +0.21%

https://llvm-compile-time-tracker.com/compare.php?from=aeb19817d66f1a15754163c7f48e01e9ebdd6d45&to=520563fdc146319aae90d06f88d87f2e9e1247b7&stat=instructions

Limit=40:
* NewPM-O3: +0.11%
* NewPM-ReleaseThinLTO: +0.12%
* NewPM-ReleaseLTO-g: +0.09%

https://llvm-compile-time-tracker.com/compare.php?from=aeb19817d66f1a15754163c7f48e01e9ebdd6d45&to=c9182576e9fe3f1c84a71479665aef91a416318c&stat=instructions

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D126236
2022-06-02 21:43:58 +01:00
Fangrui Song e09f77d394 [ELF] Remove support for legacy .zdebug sections
.zdebug is unlikely used any longer: gcc -gz switched from legacy
.zdebug to SHF_COMPRESSED with binutils 2.26 (2016), which has been
several years. clang 14 dropped -gz=zlib-gnu support. According to
Debian Code Search (`gz=zlib-gnu`), no project uses -gz=zlib-gnu.

Remove .zdebug support to (a) simplify code and (b) allow removal of llvm-mc's
--compress-debug-sections=zlib-gnu.

In case the old object file `a.o` uses .zdebug, run `objcopy --decompress-debug-sections a.o`

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D126793
2022-06-02 13:37:19 -07:00
Fangrui Song dfa9221aa7 [docs] Mention LLVMContext::setOpaquePointers for C++ API 2022-06-02 13:28:42 -07:00
Krzysztof Drewniak 1350c9887d [mlir] Add integer range inference analysis
This commit defines a dataflow analysis for integer ranges, which
uses a newly-added InferIntRangeInterface to compute the lower and
upper bounds on the results of an operation from the bounds on the
arguments. The range inference is a flow-insensitive dataflow analysis
that can be used to simplify code, such as by statically identifying
bounds checks that cannot fail in order to eliminate them.

The InferIntRangeInterface has one method, inferResultRanges(), which
takes a vector of inferred ranges for each argument to an op
implementing the interface and a callback allowing the implementation
to define the ranges for each result. These ranges are stored as
ConstantIntRanges, which hold the lower and upper bounds for a
value. Bounds are tracked separately for the signed and unsigned
interpretations of a value, which ensures that the impact of
arithmetic overflows is correctly tracked during the analysis.

The commit also adds a -test-int-range-inference pass to test the
analysis until it is integrated into SCCP or otherwise exposed.

Finally, this commit fixes some bugs relating to the handling of
region iteration arguments and terminators in the data flow analysis
framework.

Depends on D124020

Depends on D124021

Reviewed By: rriddle, Mogball

Differential Revision: https://reviews.llvm.org/D124023
2022-06-02 20:24:11 +00:00
Mingming Liu 8601f269f1 [Inline][Remark][NFC] Optionally provide inline context to inline
advisor.

This patch has no functional change, and merely a preparation patch for
main functional change. The motivating use case is to annotate inline
remark pass name with context information (e.g. prelink or postlink,
CGSCC or always-inliner), see D125495 for more details.

Differential Revision: https://reviews.llvm.org/D126824
2022-06-02 13:14:30 -07:00
Adrian Prantl e7b929d756 Adapt IRForTarget::RewriteObjCConstStrings() for D126689.
With opaque pointers, the LLVM IR expected by this function changed.
2022-06-02 13:06:40 -07:00
Philip Reames 76ac916d63 [RISCV] Inline one copy of needVSETVLI into the other [NFC]
Calling the non-MI version directly was unsound (as fixed in dcdb0bf2), so remove that version to decrease likelyhood of future mistakes.
2022-06-02 13:06:18 -07:00
Xiang Li 6bea9ff913 [HLSL] Add WaveActiveCountBits as Langugage builtin function for HLSL
One clang builtins are introduced
 uint WaveActiveCountBits( bool bBit ) as Langugage builtin function for HLSL.

The detail for WaveActiveCountBits is at
https://github.com/microsoft/DirectXShaderCompiler/wiki/Wave-Intrinsics#uint-waveactivecountbits-bool-bbit-

This is only clang part change to make WaveActiveCountBits into AST.
llvm intrinsic for WaveActiveCountBits will be add in separate PR.

Reviewed By: Anastasia

Differential Revision: https://reviews.llvm.org/D126857
2022-06-02 13:06:01 -07:00
Sanjay Patel 1882c25f12 [InstCombine] add tests for mul with low-bit mask operand; NFC 2022-06-02 16:01:23 -04:00
Sanjay Patel 8689463bfb [InstCombine] make pattern matching more consistent; NFC
We could go either way on this and several similar matches.
Just matching as a binop is possibly slightly more efficient;
we don't need to re-confirm the opcode of the instruction.
2022-06-02 16:01:23 -04:00
Maksim Panchenko 986e5dedf2 [BOLT][NFC] Fix braces in BinaryEmitter
Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D126844
2022-06-02 12:45:25 -07:00
Craig Topper fa20bf1636 [DAGCombiner][RISCV] Improve computeKnownBits for (smax X, C) where C is non-negative.
If C is non-negative, the result of the smax must also be
non-negative, so all sign bits of the result are 0.

This allows DAGCombiner to remove a zext_inreg in the modified test.
This zext_inreg started as a sext that became zext before type
legalization then was promoted to a zext_inreg.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D126896
2022-06-02 12:34:24 -07:00
Joe Loser 4be36dc77f
[libc++][test] Fix unused variable warning in string_view tests
In 6423a9f0ec, I accidentally thought this was
getting tested, but these variables are unused. Just remove the lines instead of
leaving them commented out.

Differential Revision: https://reviews.llvm.org/D126901
2022-06-02 13:33:37 -06:00
Daniel Douglas 5d25dbff67 [OpenMP][libomp] do not try to dlopen libmemkind on macOS
The memkind library is only available for linux. Calling dlopen here
can also be problematic in a client app that fork'ed.

Differential Revision: https://reviews.llvm.org/D126579
2022-06-02 14:28:09 -05:00
Paul Robinson 30b7ffe74e [PS5] Pack non-POD members in packed structs, matching PS4 ABI 2022-06-02 12:26:26 -07:00
Paul Robinson bb7835e2a7 [PS5] Apply 'packed' attribute to base classes, matching PS4 ABI 2022-06-02 12:26:26 -07:00
Florian Hahn 44c86e5cdc
[GVN] Add test for capture tracking use limit.
Test for capture-tracking-max-uses-to-explore, adjusted in D126236.
2022-06-02 20:15:26 +01:00
Aart Bik bf7dbc2a30 [mlir][sparse][bufferization] fix doc on new init operation
The example was still using the -now- removed sparse_tensor.init_tensor.
Also, I made the input operands of the matrix multiplication sparse too
(since it looks a bit strange to multiply two dense matrices into a sparse).

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D126897
2022-06-02 12:04:36 -07:00
Adrian Prantl 8eed95c83e Adapt IRForTarget::RewriteObjCSelector() for D126689.
With opaque pointers, the LLVM IR expected by this function changed.
2022-06-02 11:42:28 -07:00
Joe Nash 3732cd59be [AMDGPU] gfx11 vop3 and inherited vop instructions
This patch includes MC layer support for VOP3 encoded instructions and generic VOP support
classes.
Some VOP1 and VOP2 instructions which share an encoding with gfx10 and are using
the AssemblerPredicate = isGFX10Plus are also enabled. That predicate
will be changed to isGFX10Only in a later patch.

Patch 15/N for upstreaming of AMDGPU gfx11 architecture.

Depends on D126468

Reviewed By: dp

Differential Revision: https://reviews.llvm.org/D126475
2022-06-02 14:03:02 -04:00
Chia-hung Duan 2aeffc6d8d [mlir:MultiOpDriver] Don't add ops which are not in the allowed list
In strict mode, only the new inserted operation is allowed to add to the
worklist. Before this change, it would add the users of a replaced op
and it didn't check if the users are allowed to be pushed into the
worklist

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D126899
2022-06-02 18:27:37 +00:00
Alexey Bataev 9980c99718 [SLP]Improve shuffles cost estimation where possible.
Improved/fixed cost modeling for shuffles by providing masks, improved
cost model for non-identity insertelements.

Differential Revision: https://reviews.llvm.org/D115462
2022-06-02 11:18:14 -07:00
Anders Waldenborg 4c1e487c41 scan-build-py: Change scripts to explicitly require python3
The "#!" line in all scan-build-py scripts were using just bare
"/usr/bin/python" which according to PEP-0394 can be either python3,
python2 or not exist at all.

E.g in latest debian and ubuntu releases "/usr/bin/python" does not
exist at all by default and user must install python-is-python2 or
python-is-python3 packages to get the bare version less "python"
command.

Until recently (70b06fe8a1 "scan-build-py: Force the opening in utf-8"
changed "libscanbuild") these scripts worked in both python2 and
python3, but now they (rightfully) are python3 only, and broke on
systems where the "python" command means python2.

By changing the "#!" to be "python3" it is not only explicit that the
scripts require python3 it also works on systems where "python" command
is python2 or nonexistent.

Differential Revision: https://reviews.llvm.org/D126804
2022-06-02 20:08:21 +02:00
Paul Pluzhnikov 35ab2a11bb Fix a buglet in remove_dots().
The function promises to canonicalize the path, but neglected to do so
for the root component.

For example, calling remove_dots("/tmp/foo.c", Style::windows_backslash)
resulted in "/tmp\foo.c". Now it produces "\tmp\foo.c".

Also fix FIXME in the corresponding test.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D126412
2022-06-02 11:07:44 -07:00
Joe Nash e4870c8357 [AMDGPU] gfx11 ds instructions
MC layer support for ds instructions

Contributors:
Piotr Sobczak <Piotr.Sobczak@amd.com>

Patch 14/N for upstreaming of AMDGPU gfx11 architecture.

Depends on D126463

Reviewed By: arsenm, #amdgpu

Differential Revision: https://reviews.llvm.org/D126468
2022-06-02 13:36:56 -04:00
Paul Robinson dc5175adef [PS5] Make passing unions in registers match PS4 ABI 2022-06-02 11:00:54 -07:00
Paul Robinson cc756f91c3 [PS5] Classify __m64 as integer, matching PS4 ABI 2022-06-02 11:00:53 -07:00
Balazs Benics 7d24641f89 [llvm][analyzer][NFC] Introduce SFINAE for specializing FoldingSetTraits
Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D126803
2022-06-02 19:46:38 +02:00
Balazs Benics cf1f1b7240 [analyzer][NFC] Uplift checkers after D126801
Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D126802
2022-06-02 19:46:38 +02:00
Balazs Benics 33ca5a447e [analyzer][NFC] Add partial specializations for ProgramStateTraits
I'm also hoisting common code from the existing specializations into a
common trait impl to reduce code duplication.

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D126801
2022-06-02 19:46:38 +02:00
Will Hawkins 7b291b6f50 [libc++] Fix typo in comment at __optional_storage_base
Small typo fix(es) for struct definition of __optional_storage_base for
a reference type.

Reviewed By: #libc, jloser, philnik

Differential Revision: https://reviews.llvm.org/D126621
2022-06-02 19:46:04 +02:00
Ashay Rane 5fee1799f4
[mlir] translate memref.reshape with static shapes but dynamic dims
Prior to this patch, the lowering of memref.reshape operations to the
LLVM dialect failed if the shape argument had a static shape with
dynamic dimensions.  This patch adds the necessary support so that when
the shape argument has dynamic values, the lowering probes the dimension
at runtime to set the size in the `MemRefDescriptor` type.  This patch
also computes the stride for dynamic dimensions by deriving it from the
sizes of the inner dimensions.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D126604
2022-06-02 10:00:58 -07:00
Craig Topper 01ba470826 [RISCV] Add test case showing unnecessary extend after i32 smax on rv64. NFC
One of the operands of the smax is a positive value so computeKnownBits
determines the result of the smax must always be positive. This allows
DAG combiner to convert the sign extend to zero extend before type
legalization.

After type legalization the smax is promoted to i64 by sign extending
its inputs and the zero extend becomes an AND instruction. We are unable
to remove the AND at this point and it becomes a pair of shifts or a
zext.w.

The result of smax has as many sign bits as the minimum of its inputs.
Had we kept the sign extend instead of turning it into a zero extend
it would be removed by DAG combiner after type legalization.
2022-06-02 09:58:11 -07:00
Luís Ferreira 3da4f9c57b [lldb][NFC] Move non-clang specific method to the generic DWARF Parser
This patch renames DW_ACCESS_to_AccessType function and move it to the abstract
DWARFASTParser, since there is no clang-specific code there. This is useful for
plugins other than Clang.

Reviewed By: shafik, bulbazord

Differential Revision: https://reviews.llvm.org/D114719
2022-06-02 16:39:39 +00:00
David CARLIER 2ba5d820e2 [OpenMP] omp_get_proc_id uses sched_getcpu fallback on FreeBSD 13.1 and above.
Reviewers: jlpeyton, jdoerfert

Reviewed-By: jlpeyton

Differential-Revision: https://reviews.llvm.org/D126408
2022-06-02 17:10:29 +01:00
Mikael Simberg e27ce28139 [OpenMP][libomp] Make LIBOMP_CONFIGURED_LIBFLAGS a list instead of string
When configuring llvm with the openmp subproject, the build for the omp
target fails if LIBOMP_CONFIGURED_LIBFLAGS contains more than one item.
LIBOMP_CONFIGURED_LIBFLAGS should be a semicolon-separated list instead
of a string with items separated by spaces.

Differential Revision: https://reviews.llvm.org/D125370
2022-06-02 10:50:21 -05:00
Liqiang Tao 14e8add939 [llvm][ModuleInliner] Refactor InlineSizePriority and PriorityInlineOrder
This patch introduces the abstract base class InlinePriority to serve as
the comparison function for the priority queue.  A derived class, such
as SizePriority, may choose to cache the priorities for different
functions for performance reasons.

This design shields the type used for the priority away from classes
outside InlinePriority and classes derived from it.  In turn,
PriorityInlineOrder no longer needs to be a template class.

Reviewed By: kazu

Differential Revision: https://reviews.llvm.org/D126300
2022-06-02 23:40:26 +08:00
Liqiang Tao 5c6ed60c51 Revert "[llvm][ModuleInliner] Refactor InlineSizePriority and PriorityInlineOrder"
This reverts commit 50de7f1e77.
2022-06-02 23:18:47 +08:00
Mark de Wever 89818f2dc0 [libc++] Lets to_chars use header implementation.
This removes the duplicated code from the dylib. Instead the dylib will
call the new functions in the header. Since this code is unneeded it's
removed from the unstable ABI.

Depends on D125704

Reviewed By: #libc, ldionne

Differential Revision: https://reviews.llvm.org/D125761
2022-06-02 17:11:32 +02:00