Commit Graph

419700 Commits

Author SHA1 Message Date
Stefan Pintilie 585c85abe5 [PowerPC] Fix lowering of byval parameters for sizes greater than 8 bytes.
To store a byval parameter the existing code would store as many 8 byte elements
as were required to store the full size of the byval parameter.
For example, a parameter of size 16 would store two elements of 8 bytes.
A parameter of size 12 would also store two elements of 8 bytes.
This would sometimes store too many bytes as the size of the parameter is not
always a multiple of 8.

This patch fixes that issue and now byval parameters are stored with the correct
number of bytes.

Reviewed By: nemanjai, #powerpc, quinnp, amyk

Differential Revision: https://reviews.llvm.org/D121430
2022-03-31 15:12:46 -05:00
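The arithmetic behind the byval fix above (585c85abe5) can be illustrated with a small sketch. This is not the PowerPC lowering code; `bytes_stored_old` and `store_widths` are hypothetical helper names, and the 8/4/2/1-byte split is only one assumed way of storing exactly the right number of bytes.

```python
def bytes_stored_old(size):
    """Bytes written when every store is a full 8-byte element (old behaviour)."""
    return 8 * ((size + 7) // 8)


def store_widths(size):
    """Hypothetical split into 8/4/2/1-byte stores that total exactly `size`."""
    widths = []
    remaining = size
    for width in (8, 4, 2, 1):
        while remaining >= width:
            widths.append(width)
            remaining -= width
    return widths


# A 16-byte parameter is unaffected; a 12-byte one used to be over-stored.
old_16 = bytes_stored_old(16)
old_12 = bytes_stored_old(12)
new_12 = store_widths(12)
```

With these definitions a 12-byte parameter goes from 16 stored bytes to one 8-byte store plus one 4-byte store.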
Vy Nguyen 33e197112a [llvm-readobj] Support non 64bit platforms too
(Original phab: https://reviews.llvm.org/D116787)
2022-03-31 15:40:12 -04:00
Valentin Clement 868c212f42
[flang] Keep fully qualified !fir.heap type for fir.freemem op
Re-introduce a fully qualified type on the fir.freemem operation.
Since this is the only operation where the prefix gets elided in fir, this
patch makes it fully qualified so the dialect syntax feels more consistent.

Reviewed By: vdonaldson

Differential Revision: https://reviews.llvm.org/D122839
2022-03-31 21:37:21 +02:00
Vladislav Khmelevsky 4c14519ecb [BOLT] LongJmp: Check for shouldEmit
Check that the function will be emitted in the final binary. Preserving
the old function address is needed when the function is a PLT trampoline,
which BOLT currently does not move.

Differential Revision: https://reviews.llvm.org/D122098
2022-03-31 22:33:09 +03:00
Vladislav Khmelevsky fed958c6cc [BOLT] AArch64: Emit text objects
BOLT treats aarch64 objects located in text as empty functions with
constant islands. Emit them with at least 8-byte alignment to the new
text section.

Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei

Differential Revision: https://reviews.llvm.org/D122097
2022-03-31 22:28:50 +03:00
Nico Weber 1c5663458b better syntax 2022-03-31 15:25:43 -04:00
Vy Nguyen e6e5e3e025 [llvm-readobj] Fix forward build breakages caused by https://reviews.llvm.org/rG33b3c86afab06ad61d46456c85c0b565cfff8287
Change: use std::function instead of function_ref because it's not safe to store a function_ref

(original phab: https://reviews.llvm.org/D116787)
2022-03-31 15:22:51 -04:00
River Riddle 59bbc7a085 [GreedPatternRewriter] Preprocess constants while building worklist when not processing top down
This avoids accidentally reversing the order of constants during successive
application, e.g. when running the canonicalizer. This helps reduce the number
of iterations, and also avoids unnecessary changes to input IR.

Fixes #51892

Differential Revision: https://reviews.llvm.org/D122692
2022-03-31 12:08:55 -07:00
Stefan Pintilie 2e55bc9f3c [PowerPC] Set the special DSCR with a compiler option.
Add a compiler option and the instructions required to set the
special Data Stream Control Register (DSCR). The special register will
not be set by default.

Original patch by: Muhammad Usman

Reviewed By: nemanjai, #powerpc

Differential Revision: https://reviews.llvm.org/D117013
2022-03-31 14:06:30 -05:00
Mark de Wever 9c54a0c97a [libc++] Fixes calendar function visibility.
None of the functions was marked as _LIBCPP_HIDE_FROM_ABI.

Did some minor formatting fixes to keep consistency. To keep it easy to
review not all odd formatting has been fixed.

Reviewed By: ldionne, #libc

Differential Revision: https://reviews.llvm.org/D122834
2022-03-31 20:48:24 +02:00
Jonathan Peyton d345fe7c22 [OpenMP][libomp] NFC: Move omp_* functions out of kmp_* section 2022-03-31 13:39:30 -05:00
Vy Nguyen 1ae449f9a3 Reland "[llvm-readobj][MachO] Add option to sort the symbol table before dumping (MachO only, for now)."
https://reviews.llvm.org/D116787

This reverts commit 33b3c86afa.

New change: fixed build failures:
 - in stabs-sorted: restore the ERR-KEY statements, which were accidentally deleted during refactoring
 - in ObjDumper.h/MachODumper.cpp: refactor so that current dumpers which don't provide an impl that accepts a SymCom still work
2022-03-31 14:21:41 -04:00
Alex Zinenko 3a4ada6991 Revert "Added an empty __init__.py file to the MLIR Python bindings"
This reverts commit b50893db52.

Post-commit review pointed out that adding this file will require the
entire Python tree (including out-of-tree projects) to come from the
same directory, which might be problematic in non-default installations.
Reverting pending further discussion.
2022-03-31 20:03:52 +02:00
Florian Hahn 14e3650f01
Revert "Recommit "[LV] Remove unneeded createHeaderBranch.(NFCI)""
This reverts commit 8378a71b6c.

It looks like this patch uncovered another issue, e.g. see
https://lab.llvm.org/buildbot/#/builders/168/builds/5518
2022-03-31 19:00:48 +01:00
LLVM GN Syncbot c7639f896c [gn build] Port 46774df307 2022-03-31 17:50:34 +00:00
Roger Kim 34b9729561 [lld-macho][NFC] Encapsulate symbol priority implementation.
Just some code clean up.

Reviewed By: #lld-macho, int3

Differential Revision: https://reviews.llvm.org/D122752
2022-03-31 13:47:38 -04:00
Aaron Ballman 0e890904ea Use functions with prototypes when appropriate; NFC
A significant number of our tests in C accidentally use functions
without prototypes. This patch converts the function signatures to have
a prototype for the situations where the test is not specific to K&R C
declarations. e.g.,

  void func();

becomes

  void func(void);
2022-03-31 13:45:39 -04:00
Paul Kirth 46774df307 [misexpect] Re-implement MisExpect Diagnostics
Reimplements MisExpect diagnostics from D66324 to reconstruct its
original checking methodology only using MD_prof branch_weights
metadata.

New checks rely on 2 invariants:

1) For frontend instrumentation, MD_prof branch_weights will always be
   populated before llvm.expect intrinsics are lowered.

2) For IR and sample profiling, llvm.expect intrinsics will always be
   lowered before branch_weights are populated from the IR profiles.

These invariants allow the checking to assume how the existing branch
weights are populated depending on the profiling method used, and emit
the correct diagnostics. If these invariants are ever invalidated, the
MisExpect related checks would need to be updated, potentially by
re-introducing MD_misexpect metadata, and ensuring it always will be
transformed the same way as branch_weights in other optimization passes.

Frontend based profiling is now enabled without using LLVM Args, by
introducing a new CodeGen option, and checking if the -Wmisexpect flag
has been passed on the command line.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D115907
2022-03-31 17:38:21 +00:00
Joachim Protze 7641e42def [OpenMP][Tools] Fix handling of initial-task-end
Latest OpenMP spec says parallel_data is NULL for initial/implicit-task-end.
We nevertheless need to clean up the ParallelData here, as there is no other
callback for the end of the implicit parallel region. We can use the reference
stored in the TaskData.

Reviewed By: dreachem

Differential Revision: https://reviews.llvm.org/D114005
2022-03-31 12:33:40 -05:00
Thomas Symalla 1a6aa8b195 [AMDGPU] Add missing use check in SIOptimizeExecMasking pass.
Whenever a v_cmp, s_and_saveexec instruction sequence is to be
transformed into an equivalent s_mov, v_cmpx sequence, the pass must
check whether the v_cmp target register is used between the two
instructions. The v_cmpx instruction omits the v_cmp result, so
transforming such a sequence would produce invalid code.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D122797
2022-03-31 19:25:35 +02:00
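The use check described in 1a6aa8b195 above can be modelled roughly as follows. This is a toy sketch, not the SIOptimizeExecMasking implementation: instructions are plain dictionaries, and `can_fold_to_v_cmpx`, the `defs`/`uses` keys, and the register names are all assumptions made for illustration.

```python
def can_fold_to_v_cmpx(instrs, cmp_idx, saveexec_idx):
    """Return True if nothing strictly between the v_cmp and the
    s_and_saveexec reads the register that the v_cmp defines."""
    cmp_dest = instrs[cmp_idx]["defs"][0]
    for instr in instrs[cmp_idx + 1:saveexec_idx]:
        if cmp_dest in instr["uses"]:
            return False
    return True


# No intervening use of vcc: the rewrite is allowed in this model.
safe = [
    {"op": "v_cmp", "defs": ["vcc"], "uses": ["v0", "v1"]},
    {"op": "v_add", "defs": ["v2"], "uses": ["v0", "v1"]},
    {"op": "s_and_saveexec", "defs": ["s0", "exec"], "uses": ["vcc", "exec"]},
]

# v_cndmask reads vcc before the s_and_saveexec: the rewrite must be skipped.
unsafe = [
    {"op": "v_cmp", "defs": ["vcc"], "uses": ["v0", "v1"]},
    {"op": "v_cndmask", "defs": ["v2"], "uses": ["v0", "v1", "vcc"]},
    {"op": "s_and_saveexec", "defs": ["s0", "exec"], "uses": ["vcc", "exec"]},
]

safe_ok = can_fold_to_v_cmpx(safe, 0, 2)
unsafe_ok = can_fold_to_v_cmpx(unsafe, 0, 2)
```

Only the instructions strictly between the pair are scanned, since the s_and_saveexec itself is expected to read the compare result.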
Simon Pilgrim 535211c3eb [X86] Remove redundant FIXME
lowerV64I8Shuffle has been extended a lot since this was added.
2022-03-31 18:05:52 +01:00
Simon Pilgrim fac1729924 [X86] lowerV64I8Shuffle - don't use lowerShuffleWithPERMV until we've tried simpler options
Shuffle combining will still lower to this with better fast cross lane checks.

Noticed while triaging Issue #54658
2022-03-31 18:05:51 +01:00
Okwan Kwon 65bdeddb1e [mlir] Bubble up tensor.extract_slice above linalg operation
Bubble up extract_slice above Linalg operation.

A sequence of operations

    %0 = linalg.<op> ... arg0, arg1, ...
    %1 = tensor.extract_slice %0 ...

can be replaced with

    %0 = tensor.extract_slice %arg0
    %1 = tensor.extract_slice %arg1
    %2 = linalg.<op> ... %0, %1, ...

This reduces the computation performed by the linalg operation.

The implementation uses the tiling utility functions. One difference
from the tiling process is that we don't need to insert the checking
code for the out-of-bound accesses. The use of the slice itself
represents that the code writer is sure about the boundary condition.
To avoid adding the boundary condition check code, `omitPartialTileCheck`
is introduced for the tiling utility functions.

Differential Revision: https://reviews.llvm.org/D122437
2022-03-31 16:48:38 +00:00
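The rewrite in 65bdeddb1e above can be illustrated on a toy case. The sketch below assumes a 1-D elementwise addition standing in for `linalg.<op>` and Python list slicing standing in for `tensor.extract_slice`; for general Linalg operations the input slices are derived by the tiling utilities, which this sketch does not model. All function names are hypothetical.

```python
def elementwise_add(a, b):
    """Stand-in for an elementwise linalg operation on 1-D tensors."""
    return [x + y for x, y in zip(a, b)]


def extract_slice(tensor, offset, size):
    """Stand-in for tensor.extract_slice with unit stride."""
    return tensor[offset:offset + size]


arg0 = [1, 2, 3, 4, 5, 6]
arg1 = [10, 20, 30, 40, 50, 60]

# Original order: compute on the full tensors (6 additions), then slice.
slice_after_op = extract_slice(elementwise_add(arg0, arg1), 2, 3)

# Bubbled-up order: slice the operands first, then compute (3 additions).
slice_before_op = elementwise_add(
    extract_slice(arg0, 2, 3), extract_slice(arg1, 2, 3)
)
```

Both orders produce the same slice, but the second only computes the elements that are actually kept.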
Chris Bieneman 19054163e1 [HLSL] Further improve to numthreads diagnostics
This adds diagnostics for conflicting attributes on the same
declaration, conflicting attributes on a forward and final
declaration, and defines a more narrowly scoped HLSLEntry attribute
target.

Big shout out to @aaron.ballman for the great feedback and review on
this!
2022-03-31 11:34:01 -05:00
Abinav Puthan Purayil 898d5776ec [AMDGPU][GlobalISel] Scalarize add/sub with overflow ops in the legalizer
Differential Revision: https://reviews.llvm.org/D122803
2022-03-31 21:46:34 +05:30
Abinav Puthan Purayil db17ebd593 [AMDGPU][GlobalISel] Add end to end IR tests for add/sub with overflow
Differential Revision: https://reviews.llvm.org/D122818
2022-03-31 21:46:34 +05:30
Aaron Ballman 2267549296 Fix the build after cd26190a10
These variables were being used uninitialized and it caused a
significant number of test failures on Windows.
2022-03-31 12:03:53 -04:00
Kirill Bobyrev f43c4c5be2 Revert "[clangd] IncludeCleaner: Add support for IWYU pragma private"
This reverts commit 4cb38bfe76.

Awkwardly enough, this breaks Windows buildbots:

http://45.33.8.238/win/55402/step_9.txt

It is yet unclear why this is happening but I will need more time to
diagnose the issue.
2022-03-31 17:59:52 +02:00
Michał Górny 09b53121c3 [compiler-rt] [scudo] Use -mcrc32 on x86 when available
Update the hardware CRC32 logic in scudo to support using `-mcrc32`
instead of `-msse4.2`.  The CRC32 intrinsics use the former flag
in the newer compiler versions, e.g. in clang since 12fa608af4.
With these compilers, passing `-msse4.2` is insufficient to enable
the instructions and causes build failures when `-march` does not enable
CRC32:

    /var/tmp/portage/sys-libs/compiler-rt-sanitizers-14.0.0/work/compiler-rt/lib/scudo/scudo_crc32.cpp:20:10: error: always_inline function '_mm_crc32_u32' requires target feature 'crc32', but would be inlined into function 'computeHardwareCRC32' that is compiled without support for 'crc32'
      return CRC32_INTRINSIC(Crc, Data);
             ^
    /var/tmp/portage/sys-libs/compiler-rt-sanitizers-14.0.0/work/compiler-rt/lib/scudo/scudo_crc32.h:27:27: note: expanded from macro 'CRC32_INTRINSIC'
    #  define CRC32_INTRINSIC FIRST_32_SECOND_64(_mm_crc32_u32, _mm_crc32_u64)
                              ^
    /var/tmp/portage/sys-libs/compiler-rt-sanitizers-14.0.0/work/compiler-rt/lib/scudo/../sanitizer_common/sanitizer_platform.h:132:36: note: expanded from macro 'FIRST_32_SECOND_64'
    #  define FIRST_32_SECOND_64(a, b) (a)
                                       ^
    1 error generated.

For backwards compatibility, use `-mcrc32` when available and fall back
to `-msse4.2`.  The `<smmintrin.h>` header remains in use as it still
works and is compatible with GCC, while clang's `<crc32intrin.h>`
is not.

Originally reported in https://bugs.gentoo.org/835870.

Differential Revision: https://reviews.llvm.org/D122789
2022-03-31 17:49:42 +02:00
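The fallback order described in 09b53121c3 above ("use `-mcrc32` when available and fall back to `-msse4.2`") amounts to a simple preference list. The sketch below is not the compiler-rt build logic; `pick_crc32_flag` is a hypothetical helper, and the set of supported flags is assumed to come from some earlier compiler probe.

```python
def pick_crc32_flag(supported_flags):
    """Prefer -mcrc32, fall back to -msse4.2, else report no flag (None)."""
    for flag in ("-mcrc32", "-msse4.2"):
        if flag in supported_flags:
            return flag
    return None


newer_compiler = pick_crc32_flag({"-mcrc32", "-msse4.2"})
older_compiler = pick_crc32_flag({"-msse4.2"})
no_support = pick_crc32_flag(set())
```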
Siva Chandra 97417e0300 [libc] Enable threads.h functions on aarch64.
Reviewed By: lntue

Differential Revision: https://reviews.llvm.org/D122788
2022-03-31 08:42:07 -07:00
Sven van Haastregt 4dfec37037 [OpenCL] Set MinVersion for sub_group_barrier with memory_scope
The memory_scope enum is not available before OpenCL 2.0, so ensure
the sub_group_barrier overload with a memory_scope argument is
restricted to OpenCL 2.0 and above.  This is already the case in
opencl-c.h.

Fixes the issue revealed by https://reviews.llvm.org/D120254

Reported-by: Harald van Dijk (hvdijk)
2022-03-31 16:41:40 +01:00
Mark de Wever 11c14bca58 [libc++][ci] Installs Japanese locale in Docker.
The alternative outputs of std::put_time and std::strftime are the
easiest to test with the Japanese locale. This is a preparation for the
tests of the chrono formatters.

Note since it takes a while before the Docker file changes propagate to
the build nodes the verification of the locale is done in a separate
patch.

Reviewed By: ldionne, #libc

Differential Revision: https://reviews.llvm.org/D122736
2022-03-31 17:41:06 +02:00
Mark de Wever e3ad15d7ff [libc++][doc] Update formatting status.
Reduced the details of the non-chrono formatting information. This has
been shipped and these details are part of P0645, which is still documented.
Removing this information keeps the information up-to-date.

Adds the formatters required for the types in the chrono namespace.

Reviewed By: ldionne, #libc

Differential Revision: https://reviews.llvm.org/D122735
2022-03-31 17:37:51 +02:00
Vince Bridgers 4d5b824e3d [analyzer] Avoid checking addrspace pointers in cstring checker
This change fixes an assert that occurs in the SMT layer when refuting a
finding that uses pointers of two different sizes. This was found in a
downstream build that supports two different pointer sizes. The CString
Checker was attempting to compute an overlap for the 'to' and 'from'
pointers, where the pointers were of different sizes.

In the downstream case where this was found, a specialized memcpy
routine patterned after memcpy_special is used. The analyzer core hits
on this builtin because it matches the 'memcpy' portion of that builtin.
This cannot be duplicated in the upstream test since there are no
specialized builtins that match that pattern, but the case does
reproduce in the accompanying LIT test case. The amdgcn target was used
for this reproducer. See the documentation for AMDGPU address spaces here
https://llvm.org/docs/AMDGPUUsage.html#address-spaces.

The assert seen is:

`*Solver->getSort(LHS) == *Solver->getSort(RHS) && "AST's must have the same sort!"'

Ack to steakhal for reviewing the fix, and creating the test case.

Reviewed By: steakhal

Differential Revision: https://reviews.llvm.org/D118050
2022-03-31 17:34:56 +02:00
Peter Waller f1cb816f90 [AArch64][SVE] Mark {CNT*,RDVL,INDEX} as materializable
Differential Revision: https://reviews.llvm.org/D122731
2022-03-31 15:28:24 +00:00
Fraser Cormack ee51aefba0 [RISCV][NFC] Minor formatting fix 2022-03-31 16:15:22 +01:00
Jay Foad e8e32e5714 [AMDGPU] Fix typo in RUN line 2022-03-31 16:23:40 +01:00
Wenju He 0bda12b5bc [NewPM] Add OptimizerEarly module extension point
The VectorizerStart extension is a module callback in the old PM, but is a
function callback in the new PM. We lack a module extension point between end of
buildModuleSimplificationPipeline and the function optimization
(including vectorizer) pipeline. So this patch adds a new module
extension point before the function optimization pipeline.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D122296
2022-03-31 08:22:27 -07:00
Groverkss 152e501d87 [MLIR][Presburger] Carry IdKind information in LinearTransform::applyTo
This patch fixes a bug in LinearTransform::applyTo where it did not carry the
IdKind information, and instead treated every id as IdKind::Domain.

Reviewed By: arjunp

Differential Revision: https://reviews.llvm.org/D122823
2022-03-31 20:42:50 +05:30
Nico Weber d2f7547f14 [gn build] (manually) port 19246b0779 2022-03-31 11:10:18 -04:00
David Goldman d9739f29cd Serialize PragmaAssumeNonNullLoc to support preambles
Previously, if a `#pragma clang assume_nonnull begin` was at the
end of a preamble with a `#pragma clang assume_nonnull end` at the
end of the main file, clang would diagnose an unterminated begin in
the preamble and an unbalanced end in the main file.

With this change, those errors no longer occur and the case above is
now properly handled. I've added a corresponding test to clangd,
which makes use of preambles, in order to verify this works as
expected.

Differential Revision: https://reviews.llvm.org/D122179
2022-03-31 11:08:01 -04:00
Changpeng Fang 1711020c37 AMDGPU: Use isLiteralConstantLike to check whether the operand could ever be literal
Summary:
  To compute the size of a VALU/SALU instruction, we need to check whether an operand
could ever be literal. Previously isLiteralConstant was used, which missed cases
like global variables or external symbols. These misses lead to under-estimation of
the instruction size and branch offset, and thus to incorrectly skipping the necessary
branch relaxation when the branch offset is actually greater than what the branch bits
can hold. In this work, we use isLiteralConstantLike to check the operands. It may be
conservative, but it is safe.

Reviewers: arsenm

Differential Revision: https://reviews.llvm.org/D122778
2022-03-31 08:06:31 -07:00
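The under-estimation described in 1711020c37 above can be shown with a toy size model. The sketch assumes a 4-byte base encoding plus a 4-byte trailing literal; the operand kind strings and the two predicate functions are hypothetical labels that mimic the names in the commit message, not LLVM's actual API.

```python
LITERAL_BYTES = 4  # assumed size of a trailing 32-bit literal


def is_literal_constant(operand_kind):
    """Old check in this model: only a known immediate literal counts."""
    return operand_kind == "literal_imm"


def is_literal_constant_like(operand_kind):
    """Conservative check: anything that could end up encoded as a literal."""
    return operand_kind in ("literal_imm", "global", "external_symbol")


def estimated_size(base_bytes, operand_kinds, might_be_literal):
    """Base encoding plus one literal slot if any operand may need it."""
    needs_literal = any(might_be_literal(kind) for kind in operand_kinds)
    return base_bytes + (LITERAL_BYTES if needs_literal else 0)


operands = ["reg", "global"]
old_estimate = estimated_size(4, operands, is_literal_constant)
new_estimate = estimated_size(4, operands, is_literal_constant_like)
```

In this model the old predicate reports 4 bytes for an instruction that references a global, while the conservative one reports 8, so summed branch distances are never too small.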
Louis Dionne 0a460416e6 [libc++] Install psutil on the macOS nodes 2022-03-31 10:52:58 -04:00
Nikita Popov 0721d7c4d8 [X86] Add test for PR54369 (NFC) 2022-03-31 16:45:05 +02:00
Carlo Marcelo Arenas Belón 81f5c6270c [compiler-rt] Implement __clear_cache on FreeBSD/powerpc
dd9173420f (Add clear_cache implementation for ppc64. Fix buffer to
meet ppc64 alignment., 2017-07-28), adds an implementation for
__builtin___clear_cache on powerpc64, which was promptly amended to
also be used with big endian mode in f67036b62c (This ppc64 implementation
of clear_cache works for both big and little endian., 2017-08-02)

clang will use this implementation for its builtin on FreeBSD and result
in an abort() in the cases where 32-bit generation was requested (ex in
macppc or when the big endian powerpc64 build was done with "-m32") and as
reported[1] recently with pcre2, but there is no reason why the same code
couldn't be used in those cases, so use instead the more generic identifier
for the PowerPC architecture.

While at it, update the comment to reflect that POWER8/9 have a 128 byte
wide cache line and so the code could use 64 byte windows instead,
but that possible optimization has been punted for now.

[1] https://github.com/PhilipHazel/pcre2/issues/92

Reviewed By: jhibbits, #powerpc, MaskRay

Differential Revision: https://reviews.llvm.org/D122640
2022-03-31 14:19:26 +00:00
Arjun P 9615d717d1 [MLIR][Presburger] IntegerRelation::truncate: fix bug when truncating equalities
This was truncating inequalities instead of equalities.

Reviewed By: Groverkss

Differential Revision: https://reviews.llvm.org/D122811
2022-03-31 15:16:30 +01:00
Nikita Popov 33ac23e7cf [Float2Int] Avoid unnecessary lambdas (NFC)
Instead of first creating a lambda for calculating the range,
then collecting the ranges for the operands, and then calling the
lambda on those ranges, we can first calculate the operand ranges
and then calculate the result directly in the switch.
2022-03-31 16:13:13 +02:00
Nikita Popov f66975555f [Float2Int] Extract calcRange() method (NFC)
This avoids the awkward "Abort" flag, because we can simply
early-return instead.
2022-03-31 16:13:13 +02:00
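The shape of the refactoring in f66975555f above, replacing an "Abort" flag with an early return from an extracted function, can be sketched generically. This is not the Float2Int pass; the range arithmetic (summing operand bounds) and both function names are assumptions chosen only to show the control-flow difference.

```python
def calc_range_with_flag(operand_ranges):
    """Flag-based version: keeps looping after failure and checks at the end."""
    abort = False
    lo, hi = 0, 0
    for r in operand_ranges:
        if r is None:  # unknown operand range
            abort = True
        if not abort:
            lo += r[0]
            hi += r[1]
    if abort:
        return None
    return (lo, hi)


def calc_range(operand_ranges):
    """Extracted version: returns as soon as an operand range is unknown."""
    lo, hi = 0, 0
    for r in operand_ranges:
        if r is None:
            return None
        lo += r[0]
        hi += r[1]
    return (lo, hi)


known = [(0, 10), (-5, 5)]
unknown = [(0, 10), None, (1, 2)]
```

Both versions return the same results; the second simply has no flag to thread through the loop.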
Arjun P d81fa76f3a [MLIR][Presburger] MultiAffineFunction::eliminateRedundantLocalId: fix bug where local offset was not considered
Previously, when updating the outputs matrix, the local offset was not being considered.

Reviewed By: Groverkss

Differential Revision: https://reviews.llvm.org/D122812
2022-03-31 15:11:55 +01:00
Florian Hahn 8378a71b6c
Recommit "[LV] Remove unneeded createHeaderBranch.(NFCI)"
This reverts the revert commit 2760cdc9c6.

This version pulls in the code to create the vector loop object in VPlan
from D121624.

This is needed because otherwise existing LoopInfo verification will
fail, as a loop block doesn't have in-loop successors now that we
do not replace the branch.

Now that we do not add new loops during skeleton construction, there's
also no need to verify LI there.
2022-03-31 14:48:32 +01:00