Commit Graph

6779 Commits

Author SHA1 Message Date
Nathan Chancellor ef58ae86ba
[RISCV] Fix mcount name
GCC's name for this symbol is _mcount, which the Linux kernel expects in
a few different place:

  $ echo 'int main(void) { return 0; }' | riscv32-linux-gcc -c -pg -o tmp.o -x c -

  $ llvm-objdump -dr tmp.o | grep mcount
                          0000000c:  R_RISCV_CALL _mcount

  $ echo 'int main(void) { return 0; }' | riscv64-linux-gcc -c -pg -o tmp.o -x c -

  $ llvm-objdump -dr tmp.o | grep mcount
                  000000000000000c:  R_RISCV_CALL _mcount

  $ echo 'int main(void) { return 0; }' | clang -c -pg -o tmp.o --target=riscv32-linux-gnu -x c -

  $ llvm-objdump -dr tmp.o | grep mcount
                          0000000a:  R_RISCV_CALL_PLT     mcount

  $ echo 'int main(void) { return 0; }' | clang -c -pg -o tmp.o --target=riscv64-linux-gnu -x c -

  $ llvm-objdump -dr tmp.o | grep mcount
                  000000000000000a:  R_RISCV_CALL_PLT     mcount

Set MCountName to "_mcount" in RISCVTargetInfo then prevent it from
getting overridden in certain OSTargetInfo constructors.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D98881

Signed-off-by: Nathan Chancellor <nathan@kernel.org>
2021-03-24 18:11:37 -07:00
Nemanja Ivanovic 4020932706 [PowerPC] Make altivec.h work with AIX which has no __int128
There are a number of functions in altivec.h that use
vector __int128 which isn't supported on AIX. Those functions
need to be guarded for targets that don't support the type.
Furthermore, the functions that produce quadword instructions
without using the type need a builtin. This patch adds the
macro guards to altivec.h using the __SIZEOF_INT128__ which
is only defined on targets that support the __int128 type.
2021-03-24 00:35:51 -05:00
Zakk Chen 88c2d4c8eb [RISCV][Clang] Add RVV Vector Indexed Load intrinsic functions.
Support Complex type transformer to define more complexity legal type.

Overall our downstream implementation there are only four instructions need to
use complex type transformer, it's not a common case.
I still feel using a string for prototypes is simple and clear.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D98848
2021-03-23 19:18:50 -07:00
Bruno Cardoso Lopes 431e3138a1 [CGAtomic] Lift stronger requirements on cmpxch and support acquire failure mode
- Fix `emitAtomicCmpXchgFailureSet` to support release/acquire (succ/fail) memory order.
- Remove stronger checks for cmpxch.

Effectively, this addresses http://wg21.link/p0418

Differential Revision: https://reviews.llvm.org/D98995
2021-03-23 16:45:37 -07:00
Nancy Wang f46c41febb [SystemZ][z/OS] fix lit test related to alignment
This patch is to fix lit test case failure relate to alignment, on z/OS, maximum alignment value for 64 bit mode is 16 and also fixed clang/test/Layout/itanium-union-bitfield.cpp, attribute ((aligned(4))) is needed for bit-field member in Union for z/OS because single bit-field has one byte alignment, this will make sure size and alignment will be correct value on z/OS.

Differential Revision: https://reviews.llvm.org/D98793
2021-03-23 13:15:19 -04:00
Zakk Chen 0bc1959f51 [RISCV][NFC] Fix RVV intrinsic tests.
1. Skip the temporary file
2. Test cc1 with -S to verify codegen work well. Add '-target-feature
   +m' because the backend requires it to calculate the vscaled size/offset.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D99082
2021-03-23 06:06:05 -07:00
Nemanja Ivanovic 2f782a796a [PowerPC] Add more missing overloads to altivec.h
Add overloads that perform subtraction on v1i128 that take and
produce vector unsigned char to avoid needing to use __int128.
The overloads are suffixed with _u128 and are needed for targets
where __int128 isn't supported (AIX).
2021-03-23 05:52:36 -05:00
Nemanja Ivanovic 54e4654f04 [PowerPC] Add more missing overloads to altivec.h
Add overloads that perform addition on v1i128 that take and produce
vector unsigned char to avoid needing to use __int128. The overloads
are suffixed with _u128 and are needed for targets where __int128
isn't supported (AIX).
2021-03-23 05:09:19 -05:00
Nemanja Ivanovic 10cc5bcd86 [PowerPC] Add more missing overloads to altivec.h
Add vec_permi as a synonym for vec_xxpermdi (but only for
doubleword vectors).
2021-03-22 23:09:41 -05:00
Nemanja Ivanovic b5e96e0ad6 [PowerPC] Add more missing overloads to altivec.h
Add vec_gbb as a synonym for vec_vgbbd but for doubleword vectors.
2021-03-22 22:25:28 -05:00
Nemanja Ivanovic d8e574c8e6 [PowerPC] Add more missing overloads to altivec.h
Add vec_cvf as a synonym for vec_doublee/vec_floate.
2021-03-22 22:08:43 -05:00
Zakk Chen 1ea07ee453 Revert "[RISCV][NFC] Fix RVV intrinsic tests."
This reverts commit ab082b582d.
2021-03-22 18:51:48 -07:00
Zakk Chen ab082b582d [RISCV][NFC] Fix RVV intrinsic tests.
1. Skip the temporary file
2. Test cc1 with -S to verify codegen work well. Add '-target-feature
   +m' because the backend requires it to calculate the vscaled size/offset.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D99082
2021-03-22 18:24:03 -07:00
Nemanja Ivanovic bef2cb9062 [PowerPC] Add more missing overloads to altivec.h
Add vec_ctd which is similar to vec_ctf except the return type is
vector double rather than vector float.
2021-03-22 20:23:07 -05:00
Bradley Smith 48f5a392cb [IR] Add vscale_range IR function attribute
This attribute represents the minimum and maximum values vscale can
take. For now this attribute is not hooked up to anything during
codegen, this will be added in the future when such codegen is
considered stable.

Additionally hook up the -msve-vector-bits=<x> clang option to emit this
attribute.

Differential Revision: https://reviews.llvm.org/D98030
2021-03-22 12:05:06 +00:00
Hongtao Yu fc1812a0ad [UniqueLinkageName] Use consistent checks when mangling symbo linkage name and debug linkage name.
C functions may be declared and defined in different prototypes like below. This patch unifies the checks for mangling names in symbol linkage name emission and debug linkage name emission so that the two names are consistent.

static int go(int);

static int go(a) int a;
{
  return a;
}

Test Plan:

Differential Revision: https://reviews.llvm.org/D98799
2021-03-18 22:11:16 -07:00
Thomas Lively f5764a8654 [WebAssembly] Finalize SIMD names and opcodes
Updates the names (e.g. widen => extend, saturate => sat) and opcodes of all
SIMD instructions to match the finalized SIMD spec. Deliberately does not change
the public interface in wasm_simd128.h yet; that will require more care.

Depends on D98466.

Differential Revision: https://reviews.llvm.org/D98676
2021-03-18 11:21:25 -07:00
Thomas Lively 2f2ae08da9 [WebAssembly] Remove experimental SIMD instructions
Removes the instruction definitions, intrinsics, and builtins for qfma/qfms,
signselect, and prefetch instructions, which were not included in the final
WebAssembly SIMD spec.

Depends on D98457.

Differential Revision: https://reviews.llvm.org/D98466
2021-03-18 11:21:24 -07:00
Thomas Lively 8638c897f4 [WebAssembly] Remove unimplemented-simd target feature
Now that the WebAssembly SIMD specification is finalized and engines are
generally up-to-date, there is no need for a separate target feature for gating
SIMD instructions that engines have not implemented. With this change,
v128.const is now enabled by default with the simd128 target feature.

Differential Revision: https://reviews.llvm.org/D98457
2021-03-18 10:23:12 -07:00
Mircea Trofin 92ccc6cb17 Reapply "[NPM][CGSCC] FunctionAnalysisManagerCGSCCProxy: do not clear immutable function passes"
This reverts commit 11b70b9e3a.

The bot failure was due to ArgumentPromotion deleting functions
without deleting their analyses. This was separately fixed in 4b1c807.
2021-03-18 09:44:34 -07:00
Thomas Preud'homme e5cd5b352f [test] Fix variable definition in acle_sve_ld1.sh
Clang test acle_sve_ld1.sh is missing the colon in one of the string
variable definition separating the variable name from the regex. This
leads the substitution block to be parsed as a numeric variable use.

Reviewed By: sdesmalen

Differential Revision: https://reviews.llvm.org/D98852
2021-03-18 12:15:45 +00:00
Zakk Chen be947aded0 [RISCV][Clang] Add RVV vle/vse intrinsic functions.
Add new field PermuteOperands to mapping different operand order between
C/C++ API and clang builtin.

Reviewed By: craig.topper, rogfer01

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Hsiangkai Wang <kai.wang@sifive.com>
Co-Authored-by: Zakk Chen <zakk.chen@sifive.com>

Differential Revision: https://reviews.llvm.org/D98388
2021-03-17 20:31:25 -07:00
Zakk Chen 95c0125f2b [Clang][RISCV] Add rvv vsetvl and vsetvlmax intrinsic functions.
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D96843
2021-03-17 20:26:06 -07:00
Alex Lorenz d672d5219a Revert "[CodeGenModule] Set dso_local for Mach-O GlobalValue"
This reverts commit 809a1e0ffd.

Mach-O doesn't support dso_local and this change broke XNU because of the use of dso_local.

Differential Revision: https://reviews.llvm.org/D98458
2021-03-17 17:27:41 -07:00
Thomas Preud'homme 2426b1fa66 [Test] Fix undef var in attr-speculative-load-hardening.c
Fix use of undefined variable in CHECK-NOT directive in clang test
CodeGen/attr-speculative-load-hardening.c.

Reviewed By: kristof.beyls

Differential Revision: https://reviews.llvm.org/D93347
2021-03-17 19:12:25 +00:00
Bradley Smith cf0da91ba5 [AArch64][SVE/NEON] Add support for FROUNDEVEN for both NEON and fixed length SVE
Previously NEON used a target specific intrinsic for frintn, given that
the FROUNDEVEN ISD node now exists, move over to that instead and add
codegen support for that node for both NEON and fixed length SVE.

Differential Revision: https://reviews.llvm.org/D98487
2021-03-17 11:41:22 +00:00
Bing1 Yu 320b72e9cd [X86][AMX] Rename amx-bf16 intrinsic according to correct naming convention
__tile_tdpbf16ps should be renamed with __tile_dpbf16ps

Reviewed By: pengfei

Differential Revision: https://reviews.llvm.org/D98685
2021-03-17 11:22:52 +08:00
Amy Huang f5352dd9da Emit inline implementation of __builtin__wmemchr on MSVCRT platforms.
The MSVC runtime library doesn't have a definition for wmemchr,
so provide an inline implementation.

Differential Revision: https://reviews.llvm.org/D98472
2021-03-15 15:30:55 -07:00
diggerlin d1f1bff81b [AIX][XCOFF] Fixed the test case which failed at aix OS because enable -mignore-xcoff-visibility by default.
Summary:

because we enable -mignore-xcoff-visibility by default when there is no -fvisibility option in the clang in AIX OS
it will cause some test case fail at aix os. in order to let the -mignore-xcoff-visibility to be disable, we need to add the -fvisibility=default for those test case.

Reviewers: hubert.reinterpretcast daltenty
Differential Revision: https://reviews.llvm.org/D98660
2021-03-15 17:33:02 -04:00
Jonas Paulsson 9cfd301ec8 [SystemZ] Test for isinf and isfinite in testFPKind().
Recognize BI__builtin_isinf and BI__builtin_isfinite (and a few other opcodes
for finite) in testFPKind() and handle with TDC.

Review: Ulrich Weigand.

Differential Revision: https://reviews.llvm.org/D97901
2021-03-15 15:02:39 -06:00
Stelios Ioannou ab86edbc88 [AArch64] Implement __rndr, __rndrrs intrinsics
This patch implements the __rndr and __rndrrs intrinsics to provide access to the random
number instructions introduced in Armv8.5-A. They are only defined for the AArch64
execution state and are available when __ARM_FEATURE_RNG is defined.

These intrinsics store the random number in their pointer argument and return a status
code if the generation succeeded. The difference between __rndr __rndrrs, is that the latter
intrinsic reseeds the random number generator.

The instructions write the NZCV flags indicating the success of the operation that we can
then read with a CSET.

[1] https://developer.arm.com/docs/101028/latest/data-processing-intrinsics
[2] https://bugs.llvm.org/show_bug.cgi?id=47838

Differential Revision: https://reviews.llvm.org/D98264

Change-Id: I8f92e7bf5b450e5da3e59943b53482edf0df6efc
2021-03-15 17:51:48 +00:00
Melanie Blower 33b1f3f42c [clang][patch] Solve PR49479, File scope fp pragma should propagate to functions nested in struct, and initialization expressions
Previously, the CurFPFeatures state was set to command line settings before
semantic analysis of the nested member functions and initialization
expressions, that's not correct, it should use the pragma state which
is in effect at the lexical position.

Reviewed By: Erich Keane, Aaron Ballman

Differential Revision: https://reviews.llvm.org/D98211
2021-03-15 12:15:20 -04:00
Thomas Preud'homme f60b35340f Stop traping on sNaN in __builtin_isinf
__builtin_isinf currently generates a floating-point compare operation
which triggers a trap when faced with a signaling NaN in StrictFP mode.
This commit uses integer operations instead to not generate any trap in
such a case.

Reviewed By: mibintc

Differential Revision: https://reviews.llvm.org/D97125
2021-03-15 15:38:08 +00:00
David Green 2b3c813143 [Clang][ARM] Reenable arm_acle.c test.
This test was apparently disabled in 6fcd4e080f, without any
sign of how it was going to be reenabled. This patch rewrites the test
to use update_cc_test_checks, with midend optimizations other that
mem2reg disabled.

The first attempt of this patch in 5ae949a927 failed on bots even
though it worked locally.  I've attempted to adjust the RUN lines and
made the test AArch64/ARM specific.

Differential Revision: https://reviews.llvm.org/D98510
2021-03-14 10:59:24 +00:00
Nico Weber d7b7e2026b Revert "[Clang][ARM] Reenable arm_acle.c test."
This reverts commit 5ae949a927.
Test fails everywhere.
2021-03-12 14:37:37 -05:00
David Green 5ae949a927 [Clang][ARM] Reenable arm_acle.c test.
This test was apparently disabled in 6fcd4e080f, without any
sign of how it was going to be reenabled. This patch rewrites the test
to use update_cc_test_checks, with midend optimizations other that
mem2reg disabled.
2021-03-12 19:21:21 +00:00
Nemanja Ivanovic b5fae4b9b2 [PowerPC] Add more missing overloads to altivec.h
We are missing more predicate forms for 'vector double' and some
tests. This adds the missing overloads and completes the set of
test cases for them.
2021-03-12 10:51:57 -06:00
Sriraman Tallam cdb42a4cc4 Disable unique linkage suffixes ifor global vars until demanglers can be fixed.
D96109 added support for unique internal linkage names for both internal
linkage functions and global variables. There was a lot of discussion on how to
get the demangling right for functions but I completely missed the point that
demanglers do not support suffixes for global vars. For example:

$ c++filt _ZL3foo
foo
$ c++filt _ZL3foo.uniq.123
_ZL3foo.uniq.123

The demangling for functions works as expected.

I am not sure of the impact of this. I don't understand how debuggers and other
tools depend on the correctness of global variable demangling so I am
pre-emptively disabling it until we can get the demangling support added.

Importantly, uniquefying global variables is not needed right now as we do not
do profile attribution to global vars based on sampling. It was added for
completeness and so this feature is not exactly missed.

Differential Revision: https://reviews.llvm.org/D98392
2021-03-11 20:59:30 -08:00
Mircea Trofin 11b70b9e3a Revert "[NPM][CGSCC] FunctionAnalysisManagerCGSCCProxy: do not clear immutable function passes"
This reverts commit 5eaeb0fa67.

It appears there are analyses that assume clearing - example:
https://lab.llvm.org/buildbot#builders/36/builds/5964
2021-03-11 18:31:19 -08:00
Mircea Trofin 5eaeb0fa67 [NPM][CGSCC] FunctionAnalysisManagerCGSCCProxy: do not clear immutable function passes
Check with the analysis result by calling invalidate instead of clear on
the analysis manager.

Differential Revision: https://reviews.llvm.org/D98440
2021-03-11 18:15:28 -08:00
Florian Hahn c92ec0dd92
[Matrix] Add support for matrix-by-scalar division.
This patch extends the matrix spec to allow matrix-by-scalar division.

Originally support for `/` was left out to avoid ambiguity for the
matrix-matrix version of `/`, which could either be elementwise or
specified as matrix multiplication M1 * (1/M2).

For the matrix-scalar version, no ambiguity exists; `*` is also
an elementwise operation in that case. Matrix-by-scalar division
is commonly supported by systems including Matlab, Mathematica
or NumPy.

Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D97857
2021-03-11 22:21:23 +00:00
Zakk Chen d6a0560bf2 [Clang][RISCV] Add custom TableGen backend for riscv-vector intrinsics.
Demonstrate how to generate vadd/vfadd intrinsic functions

1. add -gen-riscv-vector-builtins for clang builtins.
2. add -gen-riscv-vector-builtin-codegen for clang codegen.
3. add -gen-riscv-vector-header for riscv_vector.h. It also generates
ifdef directives with extension checking, base on D94403.
4. add -gen-riscv-vector-generic-header for riscv_vector_generic.h.
Generate overloading version Header for generic api.
https://github.com/riscv/rvv-intrinsic-doc/blob/master/rvv-intrinsic-rfc.md#c11-generic-interface
5. update tblgen doc for riscv related options.

riscv_vector.td also defines some unused type transformers for vadd,
because I think it could demonstrate how tranfer type work and we need
them for the whole intrinsic functions implementation in the future.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Zakk Chen <zakk.chen@sifive.com>

Reviewed By: jrtc27, craig.topper, HsiangKai, Jim, Paul-C-Anagnostopoulos

Differential Revision: https://reviews.llvm.org/D95016
2021-03-10 18:43:43 -08:00
Jingu Kang 25951c5ab8 [AArch64] Add missing intrinsics for scalar FP rounding
Differential Revision: https://reviews.llvm.org/D98269
2021-03-10 13:22:29 +00:00
Fangrui Song b4948c27d2 Revert D97743 "Define __GCC_HAVE_DWARF2_CFI_ASM if applicable"
This reverts commit c11ff4bbad & df67d35269.

Trying to make the change to the driver to avoid round-trip issues.
2021-03-09 12:14:12 -08:00
Fangrui Song df67d35269 [test] Fix debug-info-macro.c 2021-03-09 12:04:51 -08:00
diggerlin 46d4d1fea4 [AIX] do not emit visibility attribute into IR when there is -mignore-xcoff-visibility
SUMMARY:

n the patch https://reviews.llvm.org/D87451 "add new option -mignore-xcoff-visibility"
we did as "The option -mignore-xcoff-visibility has no effect on visibility attribute when compile with -emit-llvm option to generated LLVM IR."

in these patch we let -mignore-xcoff-visibility effect on generating IR too. the new feature only work on AIX OS

Reviewer: Jason Liu,

Differential Revision: https://reviews.llvm.org/D89986
2021-03-09 10:38:00 -05:00
Tomas Matheson 7e5cea5b50 [Clang][Sema] Warn when function argument is less aligned than parameter
See https://bugs.llvm.org/show_bug.cgi?id=42154.

GCC's __attribute__((align)) can reduce the alignment of a type when applied to
a typedef.  However, functions which take a pointer or reference to the
original type are compiled assuming the original alignment.  Therefore when any
such function is passed an object of the new, less-aligned type, an alignment
fault can occur.  In particular, this applies to the constructor, which is
defined for the original type and called for the less-aligned object.

This change adds a warning whenever an pointer or reference to an object is
passed to a function that was defined for a more-aligned type.

The calls to ASTContext::getTypeAlignInChars seem change the order in which
record layouts are evaluated, which caused changes to the output of
-fdump-record-layouts. As such some tests needed to be updated:

  * Use CHECK-LABEL rather than counting the number of "Dumping AST Record
    Layout" headers.

  * Check for end of line in labels, so that struct B1 doesn't match struct B
    etc.

  * Add --strict-whitespace, since the whitespace shows meaningful structure.

  * The order in which record layouts are printed has changed in some cases.

  * clang-format for regions changed

Differential Revision: https://reviews.llvm.org/D97187
2021-03-09 10:37:32 +00:00
Ahsan Saghir acce401068 [PowerPC] Change target data layout for 16-byte stack alignment
This changes the target data layout to make stack align to 16 bytes
on Power10. Before this change, stack was being aligned to 32 bytes.

Reviewed By: #powerpc, nemanjai

Differential Revision: https://reviews.llvm.org/D96265
2021-03-08 08:13:08 -06:00
Saurabh Jha 63851a701e
[Matrix] Implement += and -= for MatrixType.
Make sure CompLHSTy is set correctly for += and -= and matrix type
operands.

Bugzilla ticket is here https://bugs.llvm.org/show_bug.cgi?id=46164

Patch by Saurabh Jha <saurabh.jhaa@gmail.com>

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D98075
2021-03-08 09:32:11 +00:00
Nemanja Ivanovic f4ad7a1a15 [PowerPC] Add missing double precision vec_all overloads to altivec.h
We somehow missed vec_all_nlt, vec_all_nle and vec_all_numeric
overloads for double precision vectors when VSX is enabled.
2021-03-05 18:42:12 -06:00
Sriraman Tallam 78d0e91865 Refactor -funique-internal-linakge-names implementation.
The option -funique-internal-linkage-names was added in D73307 and D78243 as a
LLVM early pass to insert a unique suffix to internal linkage functions and
vars. The unique suffix was the hash of the module path. However, we found
that this can be done more cleanly in clang early and the fixes that need to
be done later can be completely avoided. The fixes in particular are trying
to modify the DW_AT_linkage_name and finding the right place to insert the
pass.

This patch ressurects the original implementation proposed in D73307 which
was reviewed and then ditched in favor of the pass based approach.

Differential Revision: https://reviews.llvm.org/D96109
2021-03-05 13:32:17 -08:00
Chen Zheng afa76fe67a [XCOFF][DWARF] set default DWARF version to 3.
Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D98010
2021-03-05 09:21:57 -05:00
Jingu Kang 9b302513f6 [AArch64] Add missing intrinsics for vrnd 2021-03-05 11:26:12 +00:00
Gui Andrade 10264a1b21 Introduce noundef attribute at call sites for stricter poison analysis
This change adds a new IR noundef attribute, which denotes when a function call argument or return val may never contain uninitialized bits.

In MemorySanitizer, this attribute enables optimizations which decrease instrumented code size by up to 17% (measured with an instrumented build of clang) . I'll introduce the change allowing msan to take advantage of this information in a separate patch.

Differential Revision: https://reviews.llvm.org/D81678
2021-03-04 12:15:12 -08:00
Thomas Preud'homme 52bfe6605a Add __builtin_isnan(__fp16) testcase
Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D97777
2021-03-04 13:03:48 +00:00
Thomas Preud'homme 6d6e7132f9 Revert "Add __builtin_isnan(__fp16) testcase"
This reverts commit e77b5c40d5 because it
fails without 1b6eb56aa0.
2021-03-04 12:18:03 +00:00
Thomas Preud'homme b7aeece47c Revert "Stop traping on sNaN in __builtin_isinf"
This reverts commit 1b6eb56aa0 because the
invert logic for isfinite is incorrect.
2021-03-04 12:07:35 +00:00
Wang, Pengfei e7e67c930a Add Windows ehcont section support (/guard:ehcont).
Add option /guard:ehcont

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D96709
2021-03-04 11:47:29 +08:00
Fangrui Song 584cb67d2d [IRSymTab] Set FB_used on llvm.compiler.used symbols
IR symbol table does not parse inline asm. A symbol only referenced by inline
asm is not in the IR symbol table, so LTO does not know that the definition (in
another translation unit) is referenced and may internalize it, even if that
definition has `__attribute__((used))` (which lowers to `llvm.compiler.used` on
ELF targets since D97446).

```
// cabac.c
__attribute__((used)) const uint8_t ff_h264_cabac_tables[...] = {...};

// h264_cabac.c
  asm("lea ff_h264_cabac_tables(%rip), %0" : ...);
```

`__attribute__((used))` is the recommended way to tell the compiler there may
be inline asm references, so the usage is perfectly fine. This patch
conservatively sets the `FB_used` bit on `llvm.compiler.used` symbols to work
around the IR symbol table limitation. Note: before D97446, Clang never emitted
symbols in the `llvm.compiler.used` list, so this change does not punish any
Clang emitted global object.

Without the patch, `ff_h264_cabac_tables` may be assigned to a non-external
partition and get internalized. Then we will get a linker error because the
`cabac.c` definition is not exposed.

Differential Revision: https://reviews.llvm.org/D97755
2021-03-03 16:22:30 -08:00
JinGu Kang 394a4d0433 [AArch64] Add missing intrinsics for vcls
Differential Revision: https://reviews.llvm.org/D97775
2021-03-03 10:17:56 +00:00
Thomas Preud'homme e77b5c40d5 Add __builtin_isnan(__fp16) testcase
Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D97777
2021-03-02 21:01:51 +00:00
Thomas Preud'homme 1b6eb56aa0 Stop traping on sNaN in __builtin_isinf
__builtin_isinf currently generates a floating-point compare operation
which triggers a trap when faced with a signaling NaN in StrictFP mode.
This commit uses integer operations instead to not generate any trap in
such a case.

Reviewed By: mibintc

Differential Revision: https://reviews.llvm.org/D97125
2021-03-02 15:54:56 +00:00
Nemanja Ivanovic 1ff93618e5 [PowerPC] Add missing overloads of vec_promote to altivec.h
The VSX-only overloads (for 8-byte element vectors) are missing.
Add the missing overloads and convert element numbering to
modulo arithmetic to match GCC and XLC.
2021-03-01 21:40:30 -06:00
Fangrui Song d942a82a07 Make -f[no-]split-dwarf-inlining CC1 default align with driver default (no inlining)
This makes CC1 and driver defaults consistent.
In addition, for more common cases (-g is specified without -gsplit-dwarf), users will not see -fno-split-dwarf-inlining in CC1 options.

Verified that the below is still true:

* `clang -g` => `splitDebugInlining: false` in DICompileUnit
* `clang -g -gsplit-dwarf` => `splitDebugInlining: false` in DICompileUnit
* `clang -g -gsplit-dwarf -fsplit-dwarf-inlining` => no `splitDebugInlining: false`

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D97706
2021-03-01 10:55:19 -08:00
Yonghong Song 283db5f083 BPF: fix enum value 0 issue for __builtin_preserve_enum_value()
Lorenz Bauer reported that the following code will have
compilation error for bpf target:
    enum e { TWO };
    bpf_core_enum_value_exists(enum e, TWO);
The clang emitted the following error message:
    __builtin_preserve_enum_value argument 1 invalid

In SemaChecking, an expression like "*(enum NAME)1" will have
cast kind CK_IntegralToPointer, but "*(enum NAME)0" will have
cast kind CK_NullToPointer. Current implementation only permits
CK_IntegralToPointer, missing enum value 0 case.

This patch permits CK_NullToPointer cast kind and
the above test case can pass now.

Differential Revision: https://reviews.llvm.org/D97659
2021-03-01 10:23:24 -08:00
Sean Fertile 3f40dbbbc7 [PowerPC][AIX] Enable passing vectors in variadic functions.
Differential Revision: https://reviews.llvm.org/D97474
2021-03-01 13:08:28 -05:00
Arthur Eubanks 040c1b49d7 Move EntryExitInstrumentation pass location
This seems to be more of a Clang thing rather than a generic LLVM thing,
so this moves it out of LLVM pipelines and as Clang extension hooks into
LLVM pipelines.

Move the post-inline EEInstrumentation out of the backend pipeline and
into a late pass, similar to other sanitizer passes. It doesn't fit
into the codegen pipeline.

Also fix up EntryExitInstrumentation not running at -O0 under the new
PM. PR49143

Reviewed By: hans

Differential Revision: https://reviews.llvm.org/D97608
2021-03-01 10:08:10 -08:00
Fangrui Song a0c1cd642d [test] Add -triple x86_64 to attr-retain.c 2021-02-26 17:26:26 -08:00
Fangrui Song 8afdacba9d Add GNU attribute 'retain'
For ELF targets, GCC 11 will set SHF_GNU_RETAIN on the section of a
`__attribute__((retain))` function/variable to prevent linker garbage
collection. (See AttrDocs.td for the linker support).

This patch adds `retain` functions/variables to the `llvm.used` list, which has
the desired linker GC semantics. Note: `retain` does not imply `used`,
so an unused function/variable can be dropped by Sema.

Before 'retain' was introduced, previous ELF solutions require inline asm or
linker tricks, e.g.  `asm volatile(".reloc 0, R_X86_64_NONE, target");`
(architecture dependent) or define a non-local symbol in the section and use
`ld -u`. There was no elegant source-level solution.

With D97448, `__attribute__((retain))` will set `SHF_GNU_RETAIN` on ELF targets.

Differential Revision: https://reviews.llvm.org/D97447
2021-02-26 16:37:50 -08:00
Fangrui Song 28cb620321 Change some addUsedGlobal to addUsedOrCompilerUsedGlobal
An global value in the `llvm.used` list does not have GC root semantics on ELF targets.
This will be changed in a subsequent backend patch.

Change some `llvm.used` in the ELF code path to use `llvm.compiler.used` to
prevent undesired GC root semantics.

Change one extern "C" alias (due to `__attribute__((used))` in extern "C") to use `llvm.compiler.used` on all targets.

GNU ld has a rule "`__start_/__stop_` references from a live input section retain the associated C identifier name sections",
which LLD may drop entirely (currently refined to exclude SHF_LINK_ORDER/SHF_GROUP) in a future release (the rule makes it clumsy to GC metadata sections; D96914 added a way to try the potential future behavior).
For `llvm.used` global values defined in a C identifier name section, keep using `llvm.used` so that
the future LLD change will not affect them.

rnk kindly categorized the changes:
```
ObjC/blocks: this wants GC root semantics, since ObjC mainly runs on Mac.
MS C++ ABI stuff: wants GC root semantics, no change
OpenMP: unsure, but GC root semantics probably don't hurt
CodeGenModule: affected in this patch to *not* use GC root semantics so that __attribute__((used)) behavior remains the same on ELF, plus two other minor use cases that don't want GC semantics
Coverage: Probably want GC root semantics
CGExpr.cpp: refers to LTO, wants GC root
CGDeclCXX.cpp: one is MS ABI specific, so yes GC root, one is some other C++ init functionality, which should form GC roots (C++ initializers can have side effects and must run)
CGDecl.cpp: Changed in this patch for __attribute__((used))
```

Differential Revision: https://reviews.llvm.org/D97446
2021-02-26 10:42:07 -08:00
Petr Hosek 8459b8ef39 [Driver] Rename -fprofile-{prefix-map,compilation-dir} to -fcoverage-{prefix-map,compilation-dir}
These flags affect coverage mapping (-fcoverage-mapping), not
-fprofile-[instr-]generate so it makes more sense to use the
-fcoverage-* prefix.

Differential Revision: https://reviews.llvm.org/D97434
2021-02-25 21:40:12 -08:00
Dan Liew 7b1d2a2891 [NFC] Switch to auto marshalling infrastructure for `-fsanitize-address-destructor-kind=` flag.
This change simplifies `clang/lib/Frontend/CompilerInvocation.cpp`
because we no longer need to manually parse the flag and set codegen
options in the frontend. However, we still need to manually parse the
flag in the driver because:

* The marshalling infrastructure doesn't operate there.
* We need to do some platform specific checks in the driver
  that will likely never be supported by any kind of marshalling
  infrastructure.

rdar://71609176

Differential Revision: https://reviews.llvm.org/D97327
2021-02-25 13:24:50 -08:00
Dan Liew 5d64dd8e3c [Clang][ASan] Introduce `-fsanitize-address-destructor-kind=` driver & frontend option.
The new `-fsanitize-address-destructor-kind=` option allows control over how module
destructors are emitted by ASan.

The new option is consumed by both the driver and the frontend and is propagated into
codegen options by the frontend.

Both the legacy and new pass manager code have been updated to consume the new option
from the codegen options.

It would be nice if the new utility functions (`AsanDtorKindToString` and
`AsanDtorKindFromString`) could live in LLVM instead of Clang so they could be
consumed by other language frontends. Unfortunately that doesn't work because
the clang driver doesn't link against the LLVM instrumentation library.

rdar://71609176

Differential Revision: https://reviews.llvm.org/D96572
2021-02-25 12:02:21 -08:00
Liu, Chen3 4bc7c8631a [X86] Support amx-bf16 intrinsic.
Adding support for intrinsics of AMX-BF16.
This patch alse fix a bug that AMX-INT8 instructions will be selected with wrong
predicate.

Differential Revision: https://reviews.llvm.org/D97358
2021-02-25 09:06:48 +08:00
Dávid Bolvanský 053dc95839 Reduce the number of attributes attached to each function
Patch takes advantage of the implicit default behavior to reduce the number of attributes, which in turns reduces compilation time.

Reviewed By: serge-sans-paille

Differential Revision: https://reviews.llvm.org/D97116
2021-02-24 07:08:44 +01:00
Hsiangkai Wang 1a35a1b074 [RISCV] Add vadd with mask and without mask builtin.
Demonstrate how to add RISC-V V builtins and lower them to IR intrinsics for V extension.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Hsiangkai Wang <kai.wang@sifive.com>

Differential Revision: https://reviews.llvm.org/D93446
2021-02-24 07:57:31 +08:00
Liu, Chen3 f8b9035aae [X86] Support amx-int8 intrinsic.
Adding support for intrinsics of TDPBSUD/TDPBUSD/TDPBUUD.

Differential Revision: https://reviews.llvm.org/D97259
2021-02-23 17:08:05 +08:00
Ryan Santhiraraja 2c25efcbd3 [AArch64] Adding SHA3 Intrinsics support
This patch adds the following SHA3 Intrinsics:
        vsha512hq_u64,
        vsha512h2q_u64,
        vsha512su0q_u64,
        vsha512su1q_u64
        veor3q_u8
        veor3q_u16
        veor3q_u32
        veor3q_u64
        veor3q_s8
        veor3q_s16
        veor3q_s32
        veor3q_s64
        vrax1q_u64
        vxarq_u64
        vbcaxq_u8
        vbcaxq_u16
        vbcaxq_u32
        vbcaxq_u64
        vbcaxq_s8
        vbcaxq_s16
        vbcaxq_s32
        vbcaxq_s64

    Note need to include +sha3 and +crypto when building from the front-end

Reviewed By: DavidSpickett

Differential Revision: https://reviews.llvm.org/D96381
2021-02-22 12:09:20 +00:00
Dávid Bolvanský ee51c42e00 Reduce the number of attributes attached to each function
This takes advantage of the implicit default behavior to reduce the number of
attributes.
2021-02-20 06:57:47 +01:00
Dávid Bolvanský cd54c57919 Reland "[Libcalls, Attrs] Annotate libcalls with noundef"
Fixed Clang tests.
2021-02-20 06:18:48 +01:00
Christopher Tetreault 55448ab540 [AArch64] Adding Neon Polynomial vadd Intrinsics
This patch adds the following intrinsics:
            vadd_p8
            vadd_p16
            vadd_p64
            vaddq_p8
            vaddq_p16
            vaddq_p64
            vaddq_p128

Reviewed By: t.p.northover, DavidSpickett, ctetreau

Differential Revision: https://reviews.llvm.org/D96825
2021-02-19 14:48:12 -08:00
Artem Belevich 1a368ae3b7 [CUDA] fix builtin constraints for PTX 7.2
This fixes build issues w/ CUDA-11 introduced by https://reviews.llvm.org/D95974

Reviewed By: yaxunl

Differential Revision: https://reviews.llvm.org/D97009
2021-02-19 09:57:21 -08:00
Nikita Popov 71a8e4e7d6 [MemCopyOpt] Enable MemorySSA by default
This enables use of MemorySSA instead of MemDep in MemCpyOpt. To
allow this without significant compile-time impact, the MemCpyOpt
pass is moved directly before DSE (in the cases where this was not
already the case), which allows us to reuse the existing MemorySSA
analysis.

Unlike the MemDep-based implementation, the MemorySSA-based MemCpyOpt
can also perform simple optimizations across basic blocks.

Differential Revision: https://reviews.llvm.org/D94376
2021-02-19 18:06:25 +01:00
Petr Hosek 5fbd1a333a [Coverage] Store compilation dir separately in coverage mapping
We currently always store absolute filenames in coverage mapping.  This
is problematic for several reasons. It poses a problem for distributed
compilation as source location might vary across machines.  We are also
duplicating the path prefix potentially wasting space.

This change modifies how we store filenames in coverage mapping. Rather
than absolute paths, it stores the compilation directory and file paths
as given to the compiler, either relative or absolute. Later when
reading the coverage mapping information, we recombine relative paths
with the working directory. This approach is similar to handling
ofDW_AT_comp_dir in DWARF.

Finally, we also provide a new option, -fprofile-compilation-dir akin
to -fdebug-compilation-dir which can be used to manually override the
compilation directory which is useful in distributed compilation cases.

Differential Revision: https://reviews.llvm.org/D95753
2021-02-18 14:34:39 -08:00
Petr Hosek fbf8b957fd Revert "[Coverage] Store compilation dir separately in coverage mapping"
This reverts commit 97ec8fa5bb since
the test is failing on some bots.
2021-02-18 12:50:24 -08:00
Pengxuan Zheng 0ec32f1326 Revert "[AArch64] Adding Neon Polynomial vadd Intrinsics"
Revert the patch due to buildbot failures.

This reverts commit d9645059c5.
2021-02-18 12:38:16 -08:00
Petr Hosek 97ec8fa5bb [Coverage] Store compilation dir separately in coverage mapping
We currently always store absolute filenames in coverage mapping.  This
is problematic for several reasons. It poses a problem for distributed
compilation as source location might vary across machines.  We are also
duplicating the path prefix potentially wasting space.

This change modifies how we store filenames in coverage mapping. Rather
than absolute paths, it stores the compilation directory and file paths
as given to the compiler, either relative or absolute. Later when
reading the coverage mapping information, we recombine relative paths
with the working directory. This approach is similar to handling
ofDW_AT_comp_dir in DWARF.

Finally, we also provide a new option, -fprofile-compilation-dir akin
to -fdebug-compilation-dir which can be used to manually override the
compilation directory which is useful in distributed compilation cases.

Differential Revision: https://reviews.llvm.org/D95753
2021-02-18 12:27:42 -08:00
Pengxuan Zheng d9645059c5 [AArch64] Adding Neon Polynomial vadd Intrinsics
This patch adds the following intrinsics:
            vadd_p8
            vadd_p16
            vadd_p64
            vaddq_p8
            vaddq_p16
            vaddq_p64
            vaddq_p128

Reviewed By: t.p.northover, DavidSpickett

Differential Revision: https://reviews.llvm.org/D96825
2021-02-18 11:33:24 -08:00
Jonas Paulsson e57bd1ff4f [CFE, SystemZ] New target hook testFPKind() for checks of FP values.
The recent commit 00a6254 "Stop traping on sNaN in builtin_isnan" changed the
lowering in constrained FP mode of builtin_isnan from an FP comparison to
integer operations to avoid trapping.

SystemZ has a special instruction "Test Data Class" which is the preferred
way to do this check. This patch adds a new target hook "testFPKind()" that
lets SystemZ emit the s390_tdc intrinsic instead.

testFPKind() takes the BuiltinID as an argument and is expected to soon
handle more opcodes than just 'builtin_isnan'.

Review: Thomas Preud'homme, Ulrich Weigand
Differential Revision: https://reviews.llvm.org/D96568
2021-02-18 12:36:46 -06:00
Jeroen Dobbelaere 46757ccb49 [clang] functions with the 'const' or 'pure' attribute must always return.
As described in
* https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-pure-function-attribute
* https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-const-function-attribute

An `__attribute__((pure))` function must always return, as well as an `__attribute__((const))` function.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D96960
2021-02-18 17:29:46 +01:00
Hsiangkai Wang 766ee1096f [Clang][RISCV] Define RISC-V V builtin types
Add the types for the RISC-V V extension builtins.

These types will be used by the RISC-V V intrinsics which require
types of the form <vscale x 1 x i64>(LMUL=1 element size=64) or
<vscale x 4 x i32>(LMUL=2 element size=32), etc. The vector_size
attribute does not work for us as it doesn't create a scalable
vector type. We want these types to be opaque and have no operators
defined for them. We want them to be sizeless. This makes them
similar to the ARM SVE builtin types. But we will have quite a bit
more types. This patch adds around 60. Later patches will add
another 230 or so types representing tuples of these types similar
to the x2/x3/x4 types in ARM SVE. But with extra complexity that
these types are combined with the LMUL concept that is unique to
RISCV.

For more background see this RFC
http://lists.llvm.org/pipermail/llvm-dev/2020-October/145850.html

Authored-by: Roger Ferrer Ibanez <roger.ferrer@bsc.es>
Co-Authored-by: Hsiangkai Wang <kai.wang@sifive.com>

Differential Revision: https://reviews.llvm.org/D92715
2021-02-18 10:17:31 +08:00
Sriraman Tallam e741916330 Basic block sections should enable not function sections implicitly.
Basic block sections enables function sections implicitly, this is not needed
and is inefficient with "=list" option.

We had basic block sections enable function sections implicitly in clang. This
is particularly inefficient with "=list" option as it places functions that do
not have any basic block sections in separate sections. This causes unnecessary
object file overhead for large applications.

This patch disables this implicit behavior. It only creates function sections
for those functions that require basic block sections.

This patch is the second of two patches and this patch removes the implicit
enabling of function sections with basic block sections in clang.

Differential Revision: https://reviews.llvm.org/D93876
2021-02-17 12:37:50 -08:00
Igor Kudrin aa84289629 [DebugInfo] Keep the DWARF64 flag in the module metadata
This allows the option to affect the LTO output. Module::Max helps to
generate debug info for all modules in the same format.

Differential Revision: https://reviews.llvm.org/D96597
2021-02-17 17:03:34 +07:00
serge-sans-paille 3c8bf29f14
Reduce the number of attributes attached to each function
This takes advantage of the implicit default behavior to reduce the number of
attributes, which in turns reduces compilation time. I've observed -3% in
instruction count when compiling sqlite3 amalgamation with -O0

Differential Revision: https://reviews.llvm.org/D96400
2021-02-16 16:19:54 +01:00
Wang, Pengfei 61da20575d [X86] Convert fmin/fmax _mm_reduce_* intrinsics to emit llvm.reduction intrinsics (PR47506)
This is a follow up of D92940.

We have successfully converted fadd/fmul _mm_reduce_* intrinsics to
llvm.reduction + reassoc flag. We can do the same approach for fmin/fmax
too, i.e. llvm.reduction + nnan flag.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D93179
2021-02-15 08:52:06 +08:00
Jonas Paulsson b3ac5b84cd [SystemZ] Fix vecintrin.h to not emit alignment hints in vec_xl/vec_xst.
vec_xl() and vec_xst() should not emit alignment hints since they take a
scalar pointer and also add a byte offset if passed.

This patch uses memcpy to achieve the desired result.

Review: Ulrich Weigand

Differential Revision: https://reviews.llvm.org/D96471
2021-02-12 18:26:36 -06:00
Florian Hahn 51bf4c0e6d [clang] Add -ffinite-loops & -fno-finite-loops options.
This patch adds 2 new options to control when Clang adds `mustprogress`:

  1. -ffinite-loops: assume all loops are finite; mustprogress is added
     to all loops, regardless of the selected language standard.
  2. -fno-finite-loops: assume no loop is finite; mustprogress is not
     added to any loop or function. We could add mustprogress to
     functions without loops, but we would have to detect that in Clang,
     which is probably not worth it.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D96419
2021-02-12 19:25:49 +00:00
Florian Hahn fb4d8fe807 [clang] Update mustprogress tests.
This unifies the positive and negative tests in a single file and
manually adjusts the check lines to check for differences surgically.
2021-02-12 16:53:51 +00:00
James Y Knight 8043d5a964 NFC: update clang tests to check ordering and alignment for atomicrmw/cmpxchg.
The ability to specify alignment was recently added, and it's an
important property which we should ensure is set as expected by
Clang. (Especially before making further changes to Clang's code in
this area.) But, because it's on the end of the lines, the existing
tests all ignore it.

Therefore, update all the tests to also verify the expected alignment
for atomicrmw and cmpxchg. While I was in there, I also updated uses
of 'load atomic' and 'store atomic', and added the memory ordering,
where that was missing.
2021-02-11 17:35:09 -05:00
Pengxuan Zheng 61cca0f2e5 [AArch64] Adding Neon Sm3 & Sm4 Intrinsics
This adds SM3 and SM4 Intrinsics support for AArch64, specifically:
        vsm3ss1q_u32
        vsm3tt1aq_u32
        vsm3tt1bq_u32
        vsm3tt2aq_u32
        vsm3tt2bq_u32
        vsm3partw1q_u32
        vsm3partw2q_u32
        vsm4eq_u32
        vsm4ekeyq_u32

Reviewed By: labrinea

Differential Revision: https://reviews.llvm.org/D95655
2021-02-11 14:20:20 -08:00
Douglas Yung 7b4832648a NFCI. With the move to the new pass manager by default, sanitize-coverage.c is now passing on ARM.
This change removes the XFAIL from the original test and duplicates the test into sanitize-coverage-old-pm.c
which uses the old pass manager and has the corresponding XFAIL.

This should fix the XPASS from this and similar runs:
http://lab.llvm.org:8011/#/builders/60/builds/1875
2021-02-11 13:18:18 -08:00
Paul Robinson 5ea2d4fa48 Avoid conflicts between debug-info and pseudo-probe profiling
After D93264, using both -fdebug-info-for-profiling and
-fpseudo-probe-for-profiling will cause the compiler to crash.
Diagnose these conflicting options in the driver.

Also, the existing CodeGen test was using the driver when it should be
running cc1.

Differential Revision: https://reviews.llvm.org/D96354
2021-02-10 07:09:18 -08:00
Wang, Pengfei dd2460ed5d [X86] Always assign reassoc flag for intrinsics *reduce_add/mul_ps/pd.
Intrinsics *reduce_add/mul_ps/pd have assumption that the elements in
the vector are reassociable. So we need to always assign the reassoc
flag when we call _mm_reduce_* intrinsics.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D96231
2021-02-09 21:14:06 +08:00
Petr Hosek 9fd9b5a9c9 Don't emit coverage mapping for excluded functions
When a function or a file is excluded using -fprofile-list= option,
don't emit coverage mapping as doing so confuses users since those
functions would always have zero count. This also reduces the binary
size considerably in cases where only a few functions or files are
being instrumented.

Differential Revision: https://reviews.llvm.org/D96000
2021-02-05 13:03:57 -08:00
Thomas Preud'homme 00a62547da Stop traping on sNaN in __builtin_isnan
__builtin_isnan currently generates a floating-point compare operation
which triggers a trap when faced with a signaling NaN in StrictFP mode.
This commit uses integer operations instead to not generate any trap in
such a case.

Reviewed By: kpn

Differential Revision: https://reviews.llvm.org/D95948
2021-02-05 18:28:48 +00:00
Krzysztof Parzyszek bc097f645e [Hexagon] Add clang builtin definitions for Hexagon V68 2021-02-04 09:54:52 -06:00
Kevin P. Neal 81b69879c9 [FPEnv][X86] Platform builtins edition: clang should get from the AST the metadata for constrained FP builtins
Currently clang is not correctly retrieving from the AST the metadata for
constrained FP builtins. This patch fixes that for the X86 specific builtins.

Differential Revision: https://reviews.llvm.org/D94614
2021-02-03 11:49:17 -05:00
Hongtao Yu 3d89b3cbec [CSSPGO] Introducing distribution factor for pseudo probe.
Sample re-annotation is required in LTO time to achieve a reasonable post-inline profile quality. However, we have seen that such LTO-time re-annotation degrades profile quality. This is mainly caused by preLTO code duplication that is done by passes such as loop unrolling, jump threading, indirect call promotion etc, where samples corresponding to a source location are aggregated multiple times due to the duplicates. In this change we are introducing a concept of distribution factor for pseudo probes so that samples can be distributed for duplicated probes scaled by a factor. We hope that optimizations duplicating code well-maintain the branch frequency information (BFI) based on which probe distribution factors are calculated. Distribution factors are updated at the end of preLTO pipeline to reflect an estimated portion of the real execution count.

This change also introduces a pseudo probe verifier that can be run after each IR passes to detect duplicated pseudo probes.

A saturated distribution factor stands for 1.0. A pesudo probe will carry a factor with the value ranged from 0.0 to 1.0. A 64-bit integral distribution factor field that represents [0.0, 1.0] is associated to each block probe. Unfortunately this cannot be done for callsite probes due to the size limitation of a 32-bit Dwarf discriminator. A 7-bit distribution factor is used instead.

Changes are also needed to the sample profile inliner to deal with prorated callsite counts. Call sites duplicated by PreLTO passes, when later on inlined in LTO time, should have the callees’s probe prorated based on the Prelink-computed distribution factors. The distribution factors should also be taken into account when computing hotness for inline candidates. Also, Indirect call promotion results in multiple callisites. The original samples should be distributed across them. This is fixed by adjusting the callisites' distribution factors.

Reviewed By: wmi

Differential Revision: https://reviews.llvm.org/D93264
2021-02-02 11:55:01 -08:00
Fangrui Song 74c94b5d9c [test] Default clang/test to FileCheck --allow-unused-prefixes=false 2021-02-02 11:22:46 -08:00
Zarko Todorovski eb3426a528 [AIX] Improve option processing for mabi=vec-extabi and mabi=vec=defaul
Opening this revision to better address comments by @hubert.reinterpretcast in https://reviews.llvm.org/rGcaaaebcde462

Reviewed By: hubert.reinterpretcast

Differential Revision: https://reviews.llvm.org/D95702
2021-02-02 10:59:21 -05:00
Nico Weber f2b4cc91e0 Revert "[test] Default clang/test to FileCheck --allow-unused-prefixes=false"
This reverts commit 80f539526e.
Many test failures on mac: http://45.33.8.238/macm1/2772/summary.html
One on win: http://45.33.8.238/win/32442/summary.html
2021-02-02 07:38:44 -05:00
Fangrui Song 80f539526e [test] Default clang/test to FileCheck --allow-unused-prefixes=false 2021-02-01 22:02:59 -08:00
Petr Hosek 0217f1c7a3 Make the profile-filter.c test compatible with 32-bit systems
This addresses PR48930.

Differential Revision: https://reviews.llvm.org/D95658
2021-01-29 09:58:32 -08:00
Abhina Sreeskantharajan 42a21778f6 [test] Use host platform specific error message substitution in lit tests
On z/OS, the following error message is not matched correctly in lit tests.

```
EDC5129I No such file or directory.
```

This patch uses a lit config substitution to check for platform specific error messages.

Reviewed By: muiez, jhenderson

Differential Revision: https://reviews.llvm.org/D95246
2021-01-29 07:16:30 -05:00
Thomas Lively 4b68b64dcc [WebAssembly] Prototype i8x16 to i32x4 widening instructions
As proposed in https://github.com/WebAssembly/simd/pull/395 and matching the
opcodes used in V8:
https://chromium-review.googlesource.com/c/v8/v8/+/2617385/4/src/wasm/wasm-opcodes.h

Differential Revision: https://reviews.llvm.org/D95557
2021-01-28 10:59:32 -08:00
James Y Knight a7246ba02a Itanium Mangling: In 'enable_if', omit X/E around <expr-primary>.
The Clang enable_if extension is mangled as an <extended-qualifier>,
which is supposed to contain <template-args>. However, we were
unconditionally emitting X/E around its arguments, neglecting the fact
that <expr-primary> should be emitted directly without the surrounding
X/E.

Differential Revision: https://reviews.llvm.org/D95488
2021-01-27 16:46:52 -05:00
Fangrui Song 3e80686186 [test] Fix clang/test/CodeGen tests 2021-01-27 10:55:27 -08:00
Freddy Ye 1edb76cc91 [X86] merge "={eax}" and "~{eax}" into "=&eax" for MSInlineASM
Reviewed By: pengfei

Differential Revision: https://reviews.llvm.org/D94466
2021-01-27 22:54:17 +08:00
Petr Hosek bb9eb19829 Support for instrumenting only selected files or functions
This change implements support for applying profile instrumentation
only to selected files or functions. The implementation uses the
sanitizer special case list format to select which files and functions
to instrument, and relies on the new noprofile IR attribute to exclude
functions from instrumentation.

Differential Revision: https://reviews.llvm.org/D94820
2021-01-26 17:13:34 -08:00
Petr Hosek 1e634f3952 Revert "Support for instrumenting only selected files or functions"
This reverts commit 4edf35f11a because
the test fails on Windows bots.
2021-01-26 12:25:28 -08:00
Fangrui Song 189f311130 CGDebugInfo CreatedLimitedType: Drop file/line for RecordType with invalid location
For Clang synthesized `__va_list_tag` (`CreateX86_64ABIBuiltinVaListDecl`),
its DW_AT_decl_file/DW_AT_decl_line are arbitrarily set from `CurLoc`.

In a stage 2 `-DCMAKE_BUILD_TYPE=Debug` clang build, I observe that
in driver.cpp, DW_AT_decl_file/DW_AT_decl_line may be set to an `#include` line
(the transitively included file uses va_arg (`__builtin_va_arg`)).
This seems arbitrary. Drop that.

Reviewed By: #debug-info, dblaikie

Differential Revision: https://reviews.llvm.org/D94735
2021-01-26 11:53:25 -08:00
Petr Hosek 4edf35f11a Support for instrumenting only selected files or functions
This change implements support for applying profile instrumentation
only to selected files or functions. The implementation uses the
sanitizer special case list format to select which files and functions
to instrument, and relies on the new noprofile IR attribute to exclude
functions from instrumentation.

Differential Revision: https://reviews.llvm.org/D94820
2021-01-26 11:11:39 -08:00
Mircea Trofin 0c0d009a88 [NFC] Disallow unused prefixes under clang/test/CodeGen
Differential Revision: https://reviews.llvm.org/D95417
2021-01-26 08:05:45 -08:00
Zarko Todorovski 028d7a3668 Remove requirement for -maltivec to be used when using -mabi=vec-extabi or -mabi=vec-default when not using vector code
The previous implementation required that `-maltivec` be specified when using either `-mabi=vec-extabi` or `-mabi=vec-default`, this patch removes that requirement.

Reviewed By: cebowleratibm

Differential Revision: https://reviews.llvm.org/D94986
2021-01-26 07:58:01 -05:00
Abhina Sreeskantharajan 978444d531 Revert "[SystemZ][z/OS] Fix No such file or directory expression error"
This reverts commit 06f8a49693.
2021-01-25 08:29:38 -05:00
Jeroen Dobbelaere 2b9a834c43 [InlineFunction] Use llvm.experimental.noalias.scope.decl for noalias arguments.
Insert a llvm.experimental.noalias.scope.decl intrinsic that identifies where a noalias argument was inlined.

This patch includes some refactorings from D90104.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D93040
2021-01-23 12:10:57 +01:00
Thomas Lively 11802eced5 [WebAssembly] Prototype new f64x2 conversions
As proposed in https://github.com/WebAssembly/simd/pull/383.

Differential Revision: https://reviews.llvm.org/D95012
2021-01-20 11:28:06 -08:00
George Burgess IV b270fd59f0 Revert "[clang] Change builtin object size when subobject is invalid"
This reverts commit 275f30df8a.

As noted on the code review (https://reviews.llvm.org/D92892), this
change causes us to reject valid code in a few cases. Reverting so we
have more time to figure out what the right fix{es are, is} here.
2021-01-20 11:03:34 -08:00
Luo, Yuanke 7e1d2224b4 [X86][AMX] Fix the typo.
The dpbsud should be dpbssd.

Differential Revision: https://reviews.llvm.org/D94943
2021-01-19 16:57:34 +08:00
Florian Hahn 291ac7e622
[AArch64] Revert back to Intrinsic<> for TME instructions.
This patch reverts back to Intrinsic for the instructions for the
transactional memory extension, so nosync is not included.
2021-01-18 18:03:58 +00:00
Abhina Sreeskantharajan 689aaba7ac [SystemZ][z/OS] Fix No such file or directory expression error matching in lit tests
On z/OS, the following error message is not matched correctly in lit tests. This patch updates the CHECK expression to match successfully.
```
EDC5129I No such file or directory.
```

Reviewed By: muiez

Differential Revision: https://reviews.llvm.org/D94239
2021-01-18 07:14:37 -05:00
Mircea Trofin e8049dc3c8 [NewPM][Inliner] Move the 'always inliner' case in the same CGSCC pass as 'regular' inliner
Expanding from D94808 - we ensure the same InlineAdvisor is used by both
InlinerPass instances. The notion of mandatory inlining is moved into
the core InlineAdvisor: advisors anyway have to handle that case, so
this change also factors out that a bit better.

Differential Revision: https://reviews.llvm.org/D94825
2021-01-15 17:59:38 -08:00
Qiu Chaofan 168be42083 [Clang] Mutate long-double math builtins into f128 under IEEE-quad
Under -mabi=ieeelongdouble on PowerPC, IEEE-quad floating point semantic
is used for long double. This patch mutates call to related builtins
into f128 version on PowerPC. And in theory, this should be applied to
other targets when their backend supports IEEE 128-bit style libcalls.

GCC already has these mutations except nansl, which is not available on
PowerPC along with other variants (nans, nansf).

Reviewed By: RKSimon, nemanjai

Differential Revision: https://reviews.llvm.org/D92080
2021-01-15 16:56:20 +08:00
Erich Keane 9e53c94d8d [NFC] Update test to not check for 'opaque' in the file name.
The intent presumably is to avoid generating 'opaque' in the IR, but the
header contains the filename. Thus, having the workspace in a directory
with opaque in it causes this test to fail.

This just adds a 'CHECK' line on target-triple, which is the last line
of the IR-header.
2021-01-14 11:24:06 -08:00
Lucas Prates 2b1e25befe [AArch64] Adding ACLE intrinsics for the LS64 extension
This introduces the ARMv8.7-A LS64 extension's intrinsics for 64 bytes
atomic loads and stores: `__arm_ld64b`, `__arm_st64b`, `__arm_st64bv`,
and `__arm_st64bv0`. These are selected into the LS64 instructions
LD64B, ST64B, ST64BV and ST64BV0, respectively.

Based on patches written by Simon Tatham.

Reviewed By: tmatheson

Differential Revision: https://reviews.llvm.org/D93232
2021-01-14 09:43:58 +00:00
Zequan Wu e53bbd9951 [IR] move nomerge attribute from function declaration/definition to callsites
Move nomerge attribute from function declaration/definition to callsites to
allow virtual function calls attach the attribute.

Differential Revision: https://reviews.llvm.org/D94537
2021-01-12 12:10:46 -08:00
Jan Svoboda 7ab803095a [clang][cli] Remove -f[no-]trapping-math from -cc1 command line
This patch removes the -f[no-]trapping-math flags from the -cc1 command line. These flags are ignored in the command line parser and their semantics is fully handled by -ffp-exception-mode.

This patch does not remove -f[no-]trapping-math from the driver command line. The driver flags are being used and do affect compilation.

Reviewed By: dexonsmith, SjoerdMeijer

Differential Revision: https://reviews.llvm.org/D93395
2021-01-12 10:00:23 +01:00
Sriraman Tallam d8c6d24359 -funique-internal-linkage-names appends a hex md5hash suffix to the symbol name which is not demangler friendly, convert it to decimal.
Please see D93747 for more context which tries to make linkage names of internal
linkage functions to be the uniqueified names. This causes a problem with gdb
because breaking using the demangled function name will not work if the new
uniqueified name cannot be demangled. The problem is the generated suffix which
is a mix of integers and letters which do not demangle. The demangler accepts
either all numbers or all letters. This patch simply converts the hash to decimal.

There is no loss of uniqueness by doing this as the precision is maintained.
The symbol names get longer by a few characters though.

Differential Revision: https://reviews.llvm.org/D94154
2021-01-11 11:10:29 -08:00
Joe Ellis 8ea72b3887 [clang][AArch64][SVE] Avoid going through memory for coerced VLST return values
VLST return values are coerced to VLATs in the function epilog for
consistency with the VLAT ABI. Previously, this coercion was done
through memory. It is preferable to use the
llvm.experimental.vector.insert intrinsic to avoid going through memory
here.

Reviewed By: c-rhodes

Differential Revision: https://reviews.llvm.org/D94290
2021-01-11 12:10:59 +00:00
Esme-Yi ffa67873a3 [PowerPC] Add variants of 64-bit vector types for vec_sel.
Summary: This patch added variants of vec_sel and fixed bugzilla 46770.

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D94162
2021-01-11 03:52:16 +00:00
Fangrui Song b41b743d46 [test] Improve weakref & weak_import tests 2021-01-09 23:56:55 -08:00
Fangrui Song e2e82c9983 [CodeGenModule] Drop dso_local on function declarations for ELF -fno-pic -fno-direct-access-external-data
ELF -fno-pic sets dso_local on a function declaration to allow direct accesses
when taking its address (similar to a data symbol). The emitted code follows the
traditional GCC/Clang -fno-pic behavior: an absolute relocation is produced.

If the function is not defined in the executable, a canonical PLT entry will be
needed at link time. This is similar to a copy relocation and is incompatible
with (-Bsymbolic or --dynamic-list linked shared objects / protected symbols in
a shared object).

This patch gives -fno-pic code a way to avoid such a canonical PLT entry.

The FIXME was about a generalization for -fpie -mpie-copy-relocations (now -fpie
-fdirect-access-external-data). While we could set dso_local to avoid GOT when
taking the address of a function declaration (there is an ignorable difference
about R_386_PC32 vs R_386_PLT32 on i386), it likely does not provide any benefit
and can just cause trouble, so we don't make the generalization.
2021-01-09 16:31:56 -08:00
Fangrui Song 38a716c30f Make -fno-pic respect -fno-direct-access-external-data
D92633 added -f[no-]direct-access-external-data to supersede -m[no-]pie-copy-relocations.
(The option works for -fpie but is a no-op for -fno-pic and -fpic.)

This patch makes -fno-pic -fno-direct-access-external-data drop dso_local from
global variable declarations. This usually causes the backend to emit a GOT
indirection for external data access. With a GOT relocation, the subsequent
-no-pie link will not have copy relocation even if the data symbol turns out to
be defined by a shared object.

Differential Revision: https://reviews.llvm.org/D92714
2021-01-09 00:32:02 -08:00
Fangrui Song 1d3ebbf537 Add -f[no-]direct-access-external-data to supersede -mpie-copy-relocations
GCC r218397 "x86-64: Optimize access to globals in PIE with copy reloc" made
-fpie code emit R_X86_64_PC32 to reference external data symbols by default.
Clang adopted -mpie-copy-relocations D19996 as a flexible alternative.

The name -mpie-copy-relocations can be improved [1] and does not capture the
idea that this option can apply to -fno-pic and -fpic [2], so this patch
introduces -f[no-]direct-access-external-data and makes -mpie-copy-relocations
their aliases for compatibility.

[1]
For
```
extern int var;
int get() { return var; }
```
if var is defined in another translation unit in the link unit, there is no copy
relocation.

[2]
-fno-pic -fno-direct-access-external-data is useful to avoid copy relocations.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65888
If a shared object is linked with -Bsymbolic or --dynamic-list and exports a
data symbol, normally the data symbol cannot be accessed by -fno-pic code
(because by default an absolute relocation is produced which will lead to a copy
relocation). -fno-direct-access-external-data can prevent copy relocations.

-fpic -fdirect-access-external-data can avoid GOT indirection. This is like the
undefined counterpart of -fno-semantic-interposition. However, the user should
define var in another translation unit and link with -Bsymbolic or
--dynamic-list, otherwise the linker will error in a -shared link. Generally
the user has better tools for their goal but I want to mention that this
combination is valid.

On COFF, the behavior is like always -fdirect-access-external-data.
`__declspec(dllimport)` is needed to enable indirect access.

There is currently no plan to affect non-ELF behaviors or -fpic behaviors.

-fno-pic -fno-direct-access-external-data will be implemented in the subsequent patch.

GCC feature request https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98112

Reviewed By: tmsriram

Differential Revision: https://reviews.llvm.org/D92633
2021-01-09 00:32:01 -08:00
Umesh Kalappa 33c8e16f66 PR47391: Canonicalize DIFiles
Like @aprantl suggested, modify to  use the canonicalized DIFile, if we
don't know the  loc info and filename for the compiler generated
functions for example static initialization functions.

Reviewed By: dblaikie, aprantl

Differential Revision: https://reviews.llvm.org/D87147
2021-01-08 22:11:16 -08:00
Heejin Ahn 7be271537e [WebAssembly] Rename wasm_rethrow_in_catch intrinsic/builtin
`wasm_rethrow_in_catch` intrinsic and builtin are used in order to
rethrow an exception when the exception is caught but there is no
matching clause within the current `catch`. For example,
```
try {
  foo();
} catch (int n) {
  ...
}
```
If the caught exception does not correspond to C++ `int` type, it should
be rethrown. These intrinsic/builtin were renamed `rethrow_in_catch`
because at the time I thought there would be another intrinsic for C++'s
`throw` keyword, which rethrows an exception. It turned out that `throw`
keyword doesn't require wasm's `rethrow` instruction, so we rename
`rethrow_in_catch` to just `rethrow` here.

Reviewed By: dschuff, tlively

Differential Revision: https://reviews.llvm.org/D94038
2021-01-08 06:55:04 -08:00
Jeffrey T Mott 275f30df8a [clang] Change builtin object size when subobject is invalid
Motivating example:

```
  struct { int v[10]; } t[10];

  __builtin_object_size(
      &t[0].v[11], // access past end of subobject
      1            // request remaining bytes of closest surrounding
                   // subobject
  );
```

In GCC, this returns 0. https://godbolt.org/z/7TeGs7

In current clang, however, this returns 356, the number of bytes
remaining in the whole variable, as if the `type` was 0 instead of 1.
https://godbolt.org/z/6Kffox

This patch checks for the specific case where we're requesting a
subobject's size (type 1) but the subobject is invalid.

Differential Revision: https://reviews.llvm.org/D92892
2021-01-07 12:34:07 -08:00
Jeroen Dobbelaere 59fce6b066 [NFC] make clang/test/CodeGen/arm_neon_intrinsics.c resistent to function attribute id changes
When introducing support for @llvm.experimental.noalias.scope.decl, this tests started failing because it checks
(for no good reason) for a function attribute id of '#8' which now becomes '#9'

Reviewed By: pratlucas

Differential Revision: https://reviews.llvm.org/D94233
2021-01-07 17:08:15 +00:00
Thomas Lively 497026c902 [WebAssembly] Prototype prefetch instructions
As proposed in https://github.com/WebAssembly/simd/pull/352 and using the
opcodes used in the V8 prototype:
https://chromium-review.googlesource.com/c/v8/v8/+/2543167. These instructions
are only usable via intrinsics and clang builtins to make them opt-in while they
are being benchmarked.

Differential Revision: https://reviews.llvm.org/D93883
2021-01-05 11:32:03 -08:00
Florian Hahn 51d5991f04
[Clang] Add AArch64 VCMLA LANE variants.
This patch adds the LANE variants for VCMLA on AArch64 as defined in
"Arm Neon Intrinsics Reference for ACLE Q3 2020" [1]

This patch also updates `dup_typed` to accept constant type strings directly.

Based on a patch by Tim Northover.

[1] https://developer.arm.com/documentation/ihi0073/latest

Reviewed By: SjoerdMeijer

Differential Revision: https://reviews.llvm.org/D93014
2021-01-05 16:14:00 +00:00
Joe Ellis 3d5b18a3fd [clang][AArch64][SVE] Avoid going through memory for coerced VLST arguments
VLST arguments are coerced to VLATs at the function boundary for
consistency with the VLAT ABI. They are then bitcast back to VLSTs in
the function prolog. Previously, this conversion is done through memory.
With the introduction of the llvm.vector.{insert,extract} intrinsic, we
can avoid going through memory here.

Depends on D92761

Differential Revision: https://reviews.llvm.org/D92762
2021-01-05 15:18:21 +00:00
Brandon Bergren 6cee9d0cf8 [PowerPC] Support powerpcle target in Clang [3/5]
Add powerpcle support to clang.

For FreeBSD, assume a freestanding environment for now, as we only need it in the first place to build loader, which runs in the OpenFirmware environment instead of the FreeBSD environment.

For Linux, recognize glibc and musl environments to match current usage in Void Linux PPC.

Adjust driver to match current binutils behavior regarding machine naming.

Adjust and expand tests.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D93919
2021-01-02 12:17:58 -06:00
Fangrui Song d1fd72343c Refactor how -fno-semantic-interposition sets dso_local on default visibility external linkage definitions
The idea is that the CC1 default for ELF should set dso_local on default
visibility external linkage definitions in the default -mrelocation-model pic
mode (-fpic/-fPIC) to match COFF/Mach-O and make output IR similar.

The refactoring is made available by 2820a2ca3a.

Currently only x86 supports local aliases. We move the decision to the driver.
There are three CC1 states:

* -fsemantic-interposition: make some linkages interposable and make default visibility external linkage definitions dso_preemptable.
* (default): selected if the target supports .Lfoo$local: make default visibility external linkage definitions dso_local
* -fhalf-no-semantic-interposition: if neither option is set or the target does not support .Lfoo$local: like -fno-semantic-interposition but local aliases are not used. So references can be interposed if not optimized out.

Add -fhalf-no-semantic-interposition to a few tests using the half-based semantic interposition behavior.
2020-12-31 13:59:45 -08:00
Fangrui Song fd739804e0 [test] Add {{.*}} to make ELF tests immune to dso_local/dso_preemptable/(none) differences
For a default visibility external linkage definition, dso_local is set for ELF
-fno-pic/-fpie and COFF and Mach-O. Since default clang -cc1 for ELF is similar
to -fpic ("PIC Level" is not set), this nuance causes unneeded binary format differences.

To make emitted IR similar, ELF -cc1 -fpic will default to -fno-semantic-interposition,
which sets dso_local for default visibility external linkage definitions.

To make this flip smooth and enable future (dso_local as definition default),
this patch replaces (function) `define ` with `define{{.*}} `,
(variable/constant/alias) `= ` with `={{.*}} `, or inserts appropriate `{{.*}} `.
2020-12-31 00:27:11 -08:00
Fangrui Song f2cc2669a0 [test] Fix -triple and delete UNSUPPORTED: system-windows 2020-12-31 00:13:34 -08:00
Luo, Yuanke 08665b1805 Support tilezero intrinsic and c interface for AMX.
Differential Revision: https://reviews.llvm.org/D92837
2020-12-31 13:24:57 +08:00
Fangrui Song 6b3351792c [test] Add {{.*}} to make tests immune to dso_local/dso_preemptable/(none) differences
For a definition (of most linkage types), dso_local is set for ELF -fno-pic/-fpie
and COFF, but not for Mach-O.  This nuance causes unneeded binary format differences.

This patch replaces (function) `define ` with `define{{.*}} `,
(variable/constant/alias) `= ` with `={{.*}} `, or inserts appropriate `{{.*}} `
if there is an explicit linkage.

* Clang will set dso_local for Mach-O, which is currently implied by TargetMachine.cpp. This will make COFF/Mach-O and executable ELF similar.
* Eventually I hope we can make dso_local the textual LLVM IR default (write explicit "dso_preemptable" when applicable) and -fpic ELF will be similar to everything else. This patch helps move toward that goal.
2020-12-30 20:52:01 -08:00
Juneyoung Lee 420d046d6b clang-format, address warnings 2020-12-30 23:05:07 +09:00
Juneyoung Lee 9b29610228 Use unary CreateShuffleVector if possible
As mentioned in D93793, there are quite a few places where unary `IRBuilder::CreateShuffleVector(X, Mask)` can be used
instead of `IRBuilder::CreateShuffleVector(X, Undef, Mask)`.
Let's update them.

Actually, it would have been more natural if the patches were made in this order:
(1) let them use unary CreateShuffleVector first
(2) update IRBuilder::CreateShuffleVector to use poison as a placeholder value (D93793)

The order is swapped, but in terms of correctness it is still fine.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D93923
2020-12-30 22:36:08 +09:00
Fangrui Song 2820a2ca3a Move -fno-semantic-interposition dso_local logic from TargetMachine to Clang CodeGenModule
This simplifies TargetMachine::shouldAssumeDSOLocal and and gives frontend the
decision to use dso_local. For LLVM synthesized functions/globals, they may lose
inferred dso_local but such optimizations are probably not very useful.

Note: the hasComdat() condition in canBenefitFromLocalAlias (D77429) may be dead now.
(llvm/CodeGen/X86/semantic-interposition-comdat.ll)
(Investigate whether we need test coverage when Fuchsia C++ ABI is clearer)
2020-12-29 23:37:55 -08:00
Luo, Yuanke 981a0bd858 [X86] Add x86_amx type for intel AMX.
The x86_amx is used for AMX intrisics. <256 x i32> is bitcast to x86_amx when
it is used by AMX intrinsics, and x86_amx is bitcast to <256 x i32> when it
is used by load/store instruction. So amx intrinsics only operate on type x86_amx.
It can help to separate amx intrinsics from llvm IR instructions (+-*/).
Thank Craig for the idea. This patch depend on https://reviews.llvm.org/D87981.

Differential Revision: https://reviews.llvm.org/D91927
2020-12-30 13:52:13 +08:00
Juneyoung Lee 278aa65cc4 [IR] Let IRBuilder's CreateVectorSplat/CreateShuffleVector use poison as placeholder
This patch updates IRBuilder to create insertelement/shufflevector using poison as a placeholder.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D93793
2020-12-30 04:21:04 +09:00
Thomas Lively 5e09e9979b [WebAssembly] Prototype extending pairwise add instructions
As proposed in https://github.com/WebAssembly/simd/pull/380. This commit makes
the new instructions available only via clang builtins and LLVM intrinsics to
make their use opt-in while they are still being evaluated for inclusion in the
SIMD proposal.

Depends on D93771.

Differential Revision: https://reviews.llvm.org/D93775
2020-12-28 14:11:14 -08:00
Juneyoung Lee 9d70dbdc2b [InstCombine] use poison as placeholder for undemanded elems
Currently undef is used as a don’t-care vector when constructing a vector using a series of insertelement.
However, this is problematic because undef isn’t undefined enough.
Especially, a sequence of insertelement can be optimized to shufflevector, but using undef as its placeholder makes shufflevector a poison-blocking instruction because undef cannot be optimized to poison.
This makes a few straightforward optimizations incorrect, such as:

```
;  https://bugs.llvm.org/show_bug.cgi?id=44185

define <4 x float> @insert_not_undef_shuffle_translate_commute(float %x, <4 x float> %y, <4 x float> %q) {
  %xv = insertelement <4 x float> %q, float %x, i32 2
  %r = shufflevector <4 x float> %y, <4 x float> %xv, <4 x i32> { 0, 6, 2, undef }
  ret <4 x float> %r ; %r[3] is undef
}
=>
define <4 x float> @insert_not_undef_shuffle_translate_commute(float %x, <4 x float> %y, <4 x float> %q) {
  %r = insertelement <4 x float> %y, float %x, i32 1
  ret <4 x float> %r ; %r[3] = %y[3], incorrect if %y[3] = poison
}

Transformation doesn't verify!
ERROR: Target is more poisonous than source
```

I’d like to suggest
1. Using poison as insertelement’s placeholder value (IRBuilder::CreateVectorSplat should be patched too)
2. Updating shufflevector’s semantics to return poison element if mask is undef

Note that poison is currently lowered into UNDEF in SelDag, so codegen part is okay.
m_Undef() matches PoisonValue as well, so existing optimizations will still fire.

The only concern is hidden miscompilations that will go incorrect when poison constant is given.
A conservative way is copying all tests having `insertelement undef` & replacing it with `insertelement poison` & run Alive2 on it, but it will create many tests and people won’t like it. :(

Instead, I’ll simply locally maintain the tests and run Alive2.
If there is any bug found, I’ll report it.

Relevant links: https://bugs.llvm.org/show_bug.cgi?id=43958 , http://lists.llvm.org/pipermail/llvm-dev/2019-November/137242.html

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D93586
2020-12-28 08:58:15 +09:00
Sriraman Tallam 34e70d722d Append ".__part." to every basic block section symbol.
Every basic block section symbol created by -fbasic-block-sections will contain
".__part." to know that this symbol corresponds to a basic block fragment of
the function.

This patch solves two problems:

a) Like D89617, we want function symbols with suffixes to be properly qualified
   so that external tools like profile aggregators know exactly what this
   symbol corresponds to.
b) The current basic block naming just adds a ".N" to the symbol name where N is
   some integer. This collides with how clang creates __cxx_global_var_init.N.
   clang creates these symbol names to call constructor functions and basic
   block symbol naming should not use the same style.

Fixed all the test cases and added an extra test for __cxx_global_var_init
breakage.

Differential Revision: https://reviews.llvm.org/D93082
2020-12-23 11:35:44 -08:00
Arthur Eubanks db1616c768 [test] Fix new-pass-manager-opt-bisect.c
Requires x86 target to be registered.
2020-12-20 17:13:42 -08:00
Samuel Eubanks 47dbee6790 Make NPM OptBisectInstrumentation use global singleton OptBisect
Currently there is an issue where the legacy pass manager uses a different OptBisect counter than the new pass manager.
This fix makes the npm OptBisectInstrumentation use the global OptBisect.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D92897
2020-12-20 13:47:56 -08:00
Roman Lebedev 897c985e1e
[InstCombine] Canonicalize SPF to abs intrinsic
This patch enables canonicalization of SPF_ABS and SPF_ABS
to the abs intrinsic.

This is a recommit, the original try was
05d4c4ebc2,
but it was reverted due to an apparent miscompile,
which since then has just been fixed by the previous commit.

Differential Revision: https://reviews.llvm.org/D87188
2020-12-18 21:18:14 +03:00
Kevin P. Neal 7fef551cb1 Revert "Revert "[FPEnv] Teach the IRBuilder about invoke's correct use of the strictfp attribute.""
Similar to D69312, and documented in D69839, the IRBuilder needs to add
the strictfp attribute to invoke instructions when constrained floating
point is enabled.

This is try 2, with the test corrected.

Differential Revision: https://reviews.llvm.org/D93134
2020-12-18 12:42:06 -05:00
Rong Xu 3733463dbb [IR][PGO] Add hot func attribute and use hot/cold attribute in func section
Clang FE currently has hot/cold function attribute. But we only have
cold function attribute in LLVM IR.

This patch adds support of hot function attribute to LLVM IR.  This
attribute will be used in setting function section prefix/suffix.
Currently .hot and .unlikely suffix only are added in PGO (Sample PGO)
compilation (through isFunctionHotInCallGraph and
isFunctionColdInCallGraph).

This patch changes the behavior. The new behavior is:
(1) If the user annotates a function as hot or isFunctionHotInCallGraph
    is true, this function will be marked as hot. Otherwise,
(2) If the user annotates a function as cold or
    isFunctionColdInCallGraph is true, this function will be marked as
    cold.

The changes are:
(1) user annotated function attribute will used in setting function
    section prefix/suffix.
(2) hot attribute overwrites profile count based hotness.
(3) profile count based hotness overwrite user annotated cold attribute.

The intention for these changes is to provide the user a way to mark
certain function as hot in cases where training input is hard to cover
all the hot functions.

Differential Revision: https://reviews.llvm.org/D92493
2020-12-17 18:41:12 -08:00
Tom Stellard 3203143f13 CodeGen: Improve generated IR for __builtin_mul_overflow(uint, uint, int)
Add a special case for handling __builtin_mul_overflow with unsigned
inputs and a signed output to avoid emitting the __muloti4 library
call on x86_64.  __muloti4 is not implemented in libgcc, so avoiding
this call fixes compilation of some programs that call
__builtin_mul_overflow with these arguments.

For example, this fixes the build of cpio with clang, which includes code from
gnulib that calls __builtin_mul_overflow with these argument types.

Reviewed By: vsk

Differential Revision: https://reviews.llvm.org/D84405
2020-12-17 14:30:31 -08:00
Baptiste Saleil c2892978e9 [PowerPC] Rename the vector pair intrinsics and builtins to replace the _mma_ prefix by _vsx_
On PPC, the vector pair instructions are independent from MMA.
This patch renames the vector pair LLVM intrinsics and Clang builtins to replace the _mma_ prefix by _vsx_ in their names.
We also move the vector pair type/intrinsic/builtin tests to their own files.

Differential Revision: https://reviews.llvm.org/D91974
2020-12-17 13:19:27 -05:00
Tomas Matheson f500662924 Detect section type conflicts between functions and variables
If two variables are declared with __attribute__((section(name))) and
the implicit section types (e.g. read only vs writeable) conflict, an
error is raised. Extend this mechanism so that an error is raised if the
section type implied by a function's __attribute__((section)) conflicts
with that of another variable.
2020-12-17 11:43:47 -05:00
Zequan Wu fb0f728805 [Clang] Make nomerge attribute a function attribute as well as a statement attribute.
Differential Revision: https://reviews.llvm.org/D92800
2020-12-17 07:45:38 -08:00
Thomas Preud'homme 150fe05db4 [Test] Fix undef var in catch-undef-behavior.c
Commit 9e52c43090 removed the directive
defining LINE_1600 but left a string substitution to that variable in a
CHECK-NOT directive. This will make that CHECK-NOT directive always fail
to match, no matter the string.

This commit follows the pattern done in
9e52c43090 of simplifying the CHECK-NOT to
only look for the function name and the opening parenthesis, thereby not
requiring the LINE_1600 variable.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D93350
2020-12-16 22:39:41 +00:00
Joe Ellis dad07baf12 [clang][AArch64][SVE] Avoid going through memory for VLAT <-> VLST casts
This change makes use of the llvm.vector.extract intrinsic to avoid
going through memory when performing bitcasts between vector-length
agnostic types and vector-length specific types.

Depends on D91362

Reviewed By: c-rhodes

Differential Revision: https://reviews.llvm.org/D92761
2020-12-16 12:24:32 +00:00
Qiu Chaofan f141d1afc5 [NFC] Pre-commit test for long-double builtins
This test reflects clang behavior on long-double type math library
builtins under default or explicit 128-bit long-double options.
2020-12-16 17:19:54 +08:00
Johannes Doerfert b9c77542e2 [Clang][Attr] Introduce the `assume` function attribute
The `assume` attribute is a way to provide additional, arbitrary
information to the optimizer. For now, assumptions are restricted to
strings which will be accumulated for a function and emitted as comma
separated string function attribute. The key of the LLVM-IR function
attribute is `llvm.assume`. Similar to `llvm.assume` and
`__builtin_assume`, the `assume` attribute provides a user defined
assumption to the compiler.

A follow up patch will introduce an LLVM-core API to query the
assumptions attached to a function. We also expect to add more options,
e.g., expression arguments, to the `assume` attribute later on.

The `omp [begin] asssumes` pragma will leverage this attribute and
expose the functionality in the absence of OpenMP.

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D91979
2020-12-15 16:51:34 -06:00
Kevin P. Neal 2ec5973fdd Revert "[FPEnv] Teach the IRBuilder about invoke's correct use of the strictfp attribute."
The test is busted on some hosts that aren't the one I'm using.

This reverts commit 67a1ffd88a.
2020-12-15 12:58:47 -05:00
Kevin P. Neal 67a1ffd88a [FPEnv] Teach the IRBuilder about invoke's correct use of the strictfp attribute.
Similar to D69312, and documented in D69839, the IRBuilder needs to add
the strictfp attribute to invoke instructions when constrained floating
point is enabled.

Differential Revision: https://reviews.llvm.org/D93134
2020-12-15 12:38:10 -05:00
Joe Ellis 5a2a8369e8 [AArch64][NEON] Remove undocumented vceqz{,q}_p16, vml{a,s}q_n_f64 intrinsics
Prior to this patch, Clang supported the following C/C++ intrinsics:

    vceqz_p16
    vceqzq_p16
    vmlaq_n_f64
    vmlsq_n_f64

... exposed through arm_neon.h. However, these intrinsics are not part
of the ACLE, allowing developers to write code that is not compatible
with other toolchains.

This patch removes these intrinsics.

There is a bug report capturing this issue here:

    https://bugs.llvm.org/show_bug.cgi?id=47471

Reviewed By: bsmith

Differential Revision: https://reviews.llvm.org/D93206
2020-12-15 17:19:16 +00:00
Jan Svoboda 56c5548d7f [clang][cli] Squash multiple cc1 -fxxx-exceptions flags into single -exception-model=xxx option
This patch enables marshalling of the exception model options while enforcing their mutual exclusivity. The clang driver interface remains the same, this only affects the cc1 command line.

Depends on D93215.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D93216
2020-12-15 10:15:58 +01:00
Gulfem Savrun Yeniceri 7c0e3a77bc [clang][IR] Add support for leaf attribute
This patch adds support for leaf attribute as an optimization hint
in Clang/LLVM.

Differential Revision: https://reviews.llvm.org/D90275
2020-12-14 14:48:17 -08:00
Philip Reames 3b3eb7f07f Speculative fix for build bot failures
(The clang build fails for me locally, so this is based on built bot output and a guess as to root cause.)

f5fe849 made the execution of LAA conditional, so I'm guessing that's the root cause.
2020-12-14 13:44:40 -08:00
Matt Arsenault ef4da3c2ba clang: Add byval on x86_intrcc parameter 0
This will allow removing the special case treatment of the parameter
and avoid depending on the pointer's element type.
2020-12-14 16:34:37 -05:00
Simon Pilgrim 4855a1004d [X86] Convert fadd/fmul _mm_reduce_* intrinsics to emit llvm.reduction intrinsics (PR47506)
Followup to D87604, having confirmed on PR47506 that we can use the llvm codegen expansion for fadd/fmul as well.

Differential Revision: https://reviews.llvm.org/D92940
2020-12-13 15:37:35 +00:00
Alexey Bader a500a43587 [CodeGen][AMDGPU] Fix ICE for static initializer IR generation
Differential Revision: https://reviews.llvm.org/D92782
2020-12-12 23:26:54 +03:00
Nico Weber a5c65de295 mac/arm: XFAIL the last 3 failing tests
We should fix them, but let's XFAIL them for now so that we can start
running check-clang on bots and lock in the passing tests.

Part of 46644.
2020-12-12 15:09:17 -05:00
Melanie Blower 320af6b138 Create SPIRABIInfo to enable SPIR_FUNC calling convention.
Background: Call to library arithmetic functions for div is emitted by the
compiler and it set wrong “C” calling convention for calls to these functions,
whereas library functions are declared with `spir_function` calling convention.
InstCombine optimization replaces such calls with “unreachable” instruction.
It looks like clang lacks SPIRABIInfo class which should specify default
calling conventions for “system” function calls. SPIR supports only
SPIR_FUNC and SPIR_KERNEL calling convention.

Reviewers: Erich Keane, Anastasia

Differential Revision: https://reviews.llvm.org/D92721
2020-12-12 05:48:20 -08:00
Marco Elver c28b18af19 [KernelAddressSanitizer] Fix globals exclusion for indirect aliases
GlobalAlias::getAliasee() may not always point directly to a
GlobalVariable. In such cases, try to find the canonical GlobalVariable
that the alias refers to.

Link: https://github.com/ClangBuiltLinux/linux/issues/1208

Reviewed By: dvyukov, nickdesaulniers

Differential Revision: https://reviews.llvm.org/D92846
2020-12-11 12:20:40 +01:00
Florian Hahn 9c4cddb53a
[Clang] Add vcmla and rotated variants for Arm ACLE.
This patch adds vcmla and the rotated variants as defined in
"Arm Neon Intrinsics Reference for ACLE Q3 2020" [1]

The *_lane_* are still missing, but they can be added separately.

This patch only adds the builtin mapping for AArch64.

[1] https://developer.arm.com/documentation/ihi0073/latest

Reviewed By: t.p.northover

Differential Revision: https://reviews.llvm.org/D92930
2020-12-10 16:54:08 +00:00
Luo, Yuanke f80b29878b [X86] AMX programming model.
This patch implements amx programming model that discussed in llvm-dev
 (http://lists.llvm.org/pipermail/llvm-dev/2020-August/144302.html).
 Thank Hal for the good suggestion in the RA. The fast RA is not in the patch yet.
 This patch implemeted 7 components.

1. The c interface to end user.
2. The AMX intrinsics in LLVM IR.
3. Transform load/store <256 x i32> to AMX intrinsics or split the
   type into two <128 x i32>.
4. The Lowering from AMX intrinsics to AMX pseudo instruction.
5. Insert psuedo ldtilecfg and build the def-use between ldtilecfg to amx
   intruction.
6. The register allocation for tile register.
7. Morph AMX pseudo instruction to AMX real instruction.

Change-Id: I935e1080916ffcb72af54c2c83faa8b2e97d5cb0

Differential Revision: https://reviews.llvm.org/D87981
2020-12-10 17:01:54 +08:00
Yuanfang Chen fc3942526f [NFCI] Add a missing triple in clang/test/CodeGen/ppc64le-varargs-f128.c 2020-12-09 18:17:34 -08:00
Kevin P. Neal acd4950d4f [FPEnv] Correct constrained metadata in fp16-ops-strict.c
This test shows we're in some cases not getting strictfp information from
the AST. Correct that.

Differential Revision: https://reviews.llvm.org/D92596
2020-12-08 10:18:32 -05:00
Tim Northover c5978f42ec UBSAN: emit distinctive traps
Sometimes people get minimal crash reports after a UBSAN incident. This change
tags each trap with an integer representing the kind of failure encountered,
which can aid in tracking down the root cause of the problem.
2020-12-08 10:28:26 +00:00
Luís Marques 3af354e863 [Clang][CodeGen][RISCV] Fix hard float ABI for struct with empty struct and complex
Fixes bug 44904.

Differential Revision: https://reviews.llvm.org/D91278
2020-12-08 09:19:05 +00:00
Luís Marques fa8f5bfa4e [Clang][CodeGen][RISCV] Fix hard float ABI test cases with empty struct
The code seemed not to account for the field 1 offset.

Differential Revision: https://reviews.llvm.org/D91270
2020-12-08 09:19:05 +00:00
Luís Marques ca93f9abdc [Clang][CodeGen][RISCV] Add hard float ABI tests with empty struct
This patch adds tests that showcase a behavior that is currently buggy.
Fix in a follow-up patch.

Differential Revision: https://reviews.llvm.org/D91269
2020-12-08 09:19:05 +00:00
Qiu Chaofan 5e85a2ba16 [PowerPC] Implement intrinsic for DARN instruction
Instruction darn was introduced in ISA 3.0. It means 'Deliver A Random
Number'. The immediate number L means:

- L=0, the number is 32-bit (higher 32-bits are all-zero)
- L=1, the number is 'conditioned' (processed by hardware to reduce bias)
- L=2, the number is not conditioned, directly from noise source

GCC implements them in three separate intrinsics: __builtin_darn,
__builtin_darn_32 and __builtin_darn_raw. This patch implements the
same intrinsics. And this change also addresses Bugzilla PR39800.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D92465
2020-12-08 14:08:52 +08:00
Jennifer Yu f8d5b49c78 Fix missing error for use of 128-bit integer inside SPIR64 device code.
Emit error for use of 128-bit integer inside device code had been
already implemented in https://reviews.llvm.org/D74387.  However,
the error is not emitted for SPIR64, because for SPIR64, hasInt128Type
return true.

hasInt128Type: is also used to control generation of certain 128-bit
predefined macros, initializer predefined 128-bit integer types and
build 128-bit ArithmeticTypes.  Except predefined macros, only the
device target is considered, since error only emit when 128-bit
integer is used inside device code, the host target (auxtarget) also
needs to be considered.

The change address:
1. (SPIR.h) Correct hasInt128Type() for SPIR targets.
2. Sema.cpp and SemaOverload.cpp: Add additional check to consider host
   target(auxtarget) when call to hasInt128Type.  So that __int128_t
   and __int128() are allowed to avoid error when they used outside
   device code.
3. SemaType.cpp: add check for SYCLIsDevice to delay the error message.
   The error will be emitted if the use of 128-bit integer in the device
   code.

   Reviewed By: Johannes Doerfert and Aaron Ballman

   Differential Revision: https://reviews.llvm.org/D92439
2020-12-07 10:42:32 -08:00