Commit Graph

41502 Commits

Author SHA1 Message Date
Yaxun (Sam) Liu 10eb3bf2d4 Skip -fPIE for AMDGPU and HIP toolchain
AMDGPU toolchain does not support -fPIE, therefore skip it if specified by driver.

Differential Revision: https://reviews.llvm.org/D88425
2020-09-28 22:03:18 -04:00
Richard Smith c375635d05 Ensure that we don't compute linkage for an anonymous class too early if
it has a member whose name is the same as a builtin.

Fixes a regression from the introduction of BuiltinAttr.
2020-09-28 17:22:40 -07:00
Jan Korous 6fd8c69049 [clang] Update warning-wall.c test
Follow-up to 1e86d637eb4f:
[clang] Selectively ena/disa-ble format-insufficient-args warning
2020-09-28 17:19:51 -07:00
Zahira Ammarguellat efd04721c9 BuildVectorType with a dependent (array) type is crashing the compiler - Fix for PR-47542
Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D88150
2020-09-28 17:10:32 -07:00
Yonghong Song 54d9f743c8 BPF: move AbstractMemberAccess and PreserveDIType passes to EP_EarlyAsPossible
Move abstractMemberAccess and PreserveDIType passes as early as
possible, right after clang code generation.

Currently, compiler may transform the above code
  p1 = llvm.bpf.builtin.preserve.struct.access(base, 0, 0);
  p2 = llvm.bpf.builtin.preserve.struct.access(p1, 1, 2);
  a = llvm.bpf.builtin.preserve_field_info(p2, EXIST);
  if (a) {
    p1 = llvm.bpf.builtin.preserve.struct.access(base, 0, 0);
    p2 = llvm.bpf.builtin.preserve.struct.access(p1, 1, 2);
    bpf_probe_read(buf, buf_size, p2);
  }
to
  p1 = llvm.bpf.builtin.preserve.struct.access(base, 0, 0);
  p2 = llvm.bpf.builtin.preserve.struct.access(p1, 1, 2);
  a = llvm.bpf.builtin.preserve_field_info(p2, EXIST);
  if (a) {
    bpf_probe_read(buf, buf_size, p2);
  }
and eventually assembly code looks like
  reloc_exist = 1;
  reloc_member_offset = 10; //calculate member offset from base
  p2 = base + reloc_member_offset;
  if (reloc_exist) {
    bpf_probe_read(bpf, buf_size, p2);
  }
if during libbpf relocation resolution, reloc_exist is actually
resolved to 0 (not exist), reloc_member_offset relocation cannot
be resolved and will be patched with illegal instruction.
This will cause verifier failure.

This patch attempts to address this issue by do chaining
analysis and replace chains with special globals right
after clang code gen. This will remove the cse possibility
described in the above. The IR typically looks like
  %6 = load @llvm.sk_buff:0:50$0:0:0:2:0
  %7 = bitcast %struct.sk_buff* %2 to i8*
  %8 = getelementptr i8, i8* %7, %6
for a particular address computation relocation.

But this transformation has another consequence, code sinking
may happen like below:
  PHI = <possibly different @preserve_*_access_globals>
  %7 = bitcast %struct.sk_buff* %2 to i8*
  %8 = getelementptr i8, i8* %7, %6

For such cases, we will not able to generate relocations since
multiple relocations are merged into one.

This patch introduced a passthrough builtin
to prevent such optimization. Looks like inline assembly has more
impact for optimizaiton, e.g., inlining. Using passthrough has
less impact on optimizations.

A new IR pass is introduced at the beginning of target-dependent
IR optimization, which does:
  - report fatal error if any reloc global in PHI nodes
  - remove all bpf passthrough builtin functions

Changes for existing CORE tests:
  - for clang tests, add "-Xclang -disable-llvm-passes" flags to
    avoid builtin->reloc_global transformation so the test is still
    able to check correctness for clang generated IR.
  - for llvm CodeGen/BPF tests, add "opt -O2 <ir_file> | llvm-dis" command
    before "llc" command since "opt" is needed to call newly-placed
    builtin->reloc_global transformation. Add target triple in the IR
    file since "opt" requires it.
  - Since target triple is added in IR file, if a test may produce
    different results for different endianness, two tests will be
    created, one for bpfeb and another for bpfel, e.g., some tests
    for relocation of lshift/rshift of bitfields.
  - field-reloc-bitfield-1.ll has different relocations compared to
    old codes. This is because for the structure in the test,
    new code returns struct layout alignment 4 while old code
    is 8. Align 8 is more precise and permits double load. With align 4,
    the new mechanism uses 4-byte load, so generating different
    relocations.
  - test intrinsic-transforms.ll is removed. This is used to test
    cse on intrinsics so we do not lose metadata. Now metadata is attached
    to global and not instruction, it won't get lost with cse.

Differential Revision: https://reviews.llvm.org/D87153
2020-09-28 16:56:22 -07:00
David Tenty ee80615b5c [clang][driver][AIX] Set compiler-rt as default rtlib
Reviewed By: hubert.reinterpretcast

Differential Revision: https://reviews.llvm.org/D88182
2020-09-28 19:45:43 -04:00
Jan Korous 1e86d637eb [clang] Selectively ena/disa-ble format-insufficient-args warning
Differential Revision: https://reviews.llvm.org/D87176
2020-09-28 16:24:50 -07:00
Craig Topper 288c5776c9 [X86] Use inlineasm flag output for the _bittest* intrinsics.
Instead of expliciting emitting a setc in the inline asm instructions,
we can use flag output. This allows the backend to use the flag
directly if it is needed by a branch. Previously we needed a test
instruction to convert the register back to a flag.

If the flag can't be used directly, the backend will emit a setcc.

Differential Revision: https://reviews.llvm.org/D87888
2020-09-28 13:33:22 -07:00
Baptiste Saleil 0156914275 [PowerPC] Legalize v256i1 and v512i1 and implement load and store of these types
This patch legalizes the v256i1 and v512i1 types that will be used for MMA.

It implements loads and stores of these types.
v256i1 is a pair of VSX registers, so for this type, we load/store the two
underlying registers. v512i1 is used for MMA accumulators. So in addition to
loading and storing the 4 associated VSX registers, we generate instructions to
prime (copy the VSX registers to the accumulator) after loading and unprime
(copy the accumulator back to the VSX registers) before storing.

This patch also adds the UACC register class that is necessary to implement the
loads and stores. This class represents accumulator in their unprimed form and
allow the distinction between primed and unprimed accumulators to avoid invalid
copies of the VSX registers associated with primed accumulators.

Differential Revision: https://reviews.llvm.org/D84968
2020-09-28 14:39:37 -05:00
Vedant Kumar 06bc685fa2 [ubsan] nullability-arg: Fix crash on C++ member pointers
Extend -fsanitize=nullability-arg to handle call sites which accept C++
member pointers.

rdar://62476022

Differential Revision: https://reviews.llvm.org/D88336
2020-09-28 09:41:18 -07:00
Michael Liao 5dbf80cad9 [clang][codegen] Annotate `correctly-rounded-divide-sqrt-fp-math` fn-attr for OpenCL only.
- `-cl-fp32-correctly-rounded-divide-sqrt` is an OpenCL-specific option
  and `correctly-rounded-divide-sqrt-fp-math` should be added for OpenCL
  at most.

Differential revision: https://reviews.llvm.org/D88303
2020-09-28 11:40:32 -04:00
Haojian Wu bf890dcb0f [clang] Don't emit "no member" diagnostic if the lookup fails on an invalid record decl.
The "no member" diagnostic is likely bogus.

Reviewed By: sammccall, #libc

Differential Revision: https://reviews.llvm.org/D86765
2020-09-28 15:10:00 +02:00
Richard Smith 9dcd96f728 Canonicalize declaration pointers when forming APValues.
References to different declarations of the same entity aren't different
values, so shouldn't have different representations.

Recommit of e6393ee813 with fixed handling
for weak declarations. We now look for attributes on the most recent
declaration when determining whether a declaration is weak. (Second
recommit with further fixes for mishandling of weak declarations. Our
behavior here is fundamentally unsound -- see PR47663 -- but this
approach attempts to not make things worse.)
2020-09-27 19:05:26 -07:00
Florian Hahn 915310bf14 Revert "[DSE] Switch to MemorySSA-backed DSE by default."
There appears to be a mis-compile with MemorySSA-backed DSE in
combination with llvm.lifetime.end. It currently appears like
DSE is doing the right thing and the llvm.lifetime.end markers
are incorrect. The reverted patch uncovers the mis-compile.

This patch temporarily switches back to the legacy DSE
implementation, while we investigate.

This reverts commit 9d172c8e9c.
2020-09-26 18:35:27 +01:00
Serge Pavlov f91b9c0f98 Run test on particular target only
The test `AST/const-fpfeatures-diag.c` requires setting strict FP
semantics, so it fails on targets where support of such semantic
is limited.
2020-09-26 20:26:34 +07:00
Serge Pavlov 6314f412a8 [FPEnv] Evaluate constant expressions under non-default rounding modes
The change implements evaluation of constant floating point expressions
under non-default rounding modes. The main objective was to support
evaluation of global variable initializers, where constant rounding mode
may be specified by `#pragma STDC FENV_ROUND`.

Differential Revision: https://reviews.llvm.org/D87822
2020-09-26 17:59:39 +07:00
Shilei Tian ebb1092a28 [Clang][OpenMP] Added support for nowait target in CodeGen via regular task
Previously for nowait target, CG emitted a function call to `__tgt_target_nowait`, etc. However, in OpenMP RTL, these functions just directly call the no-nowait version, which means nowait is not working as expected.

OpenMP specification says a target is acutally a target task, which is an untied and detachable task. It is natural to go to the direction that generates a task for a nowait target. However, OpenMP task has a problem that it must be within to a parallel region; otherwise the task will be executed immediately. As a result, if we directly wrap to a regular task, the `target nowait` outside of a parallel region is still a synchronous version.

In D77609, I added the support for unshackled task in OpenMP RTL. Basically, unshackled task is a task that is not bound to any parallel region. So all nowait target will be tranformed into an unshackled task. In order to distinguish from regular task, a new flag bit is set for unshackled task. This flag will be used by RTL for later process.

Since all target tasks are allocated via `__kmpc_omp_target_task_alloc`, and in current `libomptarget`, `__kmpc_omp_target_task_alloc` just calls `__kmpc_omp_task_alloc`. Therefore, we can modify the flag in `__kmpc_omp_target_task_alloc` so that we don't need to modify the FE too much. If users choose to opt out the feature, they just need to use a RTL w/o support of unshackled threads.

As a result, in this patch, the `target nowait` region is simply wrapped into a regular task. Later once we have RTL support for unshackled tasks, the wrapped tasks can be executed by unshackled threads w/o changes in the FE.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D78075
2020-09-25 22:10:36 -04:00
Evandro Menezes a000580a89 [RISCV] Update driver tests
Add the RISC-V Bullet core to the driver tests.
2020-09-25 18:36:53 -05:00
Saleem Abdulrasool 58cdbf518b Sema: add support for `__attribute__((__swift_private__))`
This attribute allows declarations to be restricted to the framework
itself, enabling Swift to remove the declarations when importing
libraries.  This is useful in the case that the functions can be
implemented in a more natural way for Swift.

This is based on the work of the original changes in
8afaf3aad2

Differential Revision: https://reviews.llvm.org/D87720
Reviewed By: Aaron Ballman
2020-09-25 22:33:53 +00:00
Matt Arsenault 55c4ff91bd OpaquePtr: Add type to sret attribute
Make the corresponding change that was made for byval in
b7141207a4. Like byval, this requires a
bulk update of the test IR tests to include the type before this can
be mandatory.
2020-09-25 14:07:30 -04:00
Chris Bowler f330d9f163 [PPC] [AIX] Implement calling convention IR for C99 complex types on AIX
Add AIX calling convention logic to Clang for C99 complex types on AIX

Differential Revision: https://reviews.llvm.org/D88130
2020-09-25 07:43:31 -04:00
Momchil Velikov a88c722e68 [AArch64] PAC/BTI code generation for LLVM generated functions
PAC/BTI-related codegen in the AArch64 backend is controlled by a set
of LLVM IR function attributes, added to the function by Clang, based
on command-line options and GCC-style function attributes. However,
functions, generated in the LLVM middle end (for example,
asan.module.ctor or __llvm_gcov_write_out) do not get any attributes
and the backend incorrectly does not do any PAC/BTI code generation.

This patch record the default state of PAC/BTI codegen in a set of
LLVM IR module-level attributes, based on command-line options:

* "sign-return-address", with non-zero value means generate code to
  sign return addresses (PAC-RET), zero value means disable PAC-RET.

* "sign-return-address-all", with non-zero value means enable PAC-RET
  for all functions, zero value means enable PAC-RET only for
  functions, which spill LR.

* "sign-return-address-with-bkey", with non-zero value means use B-key
  for signing, zero value mean use A-key.

This set of attributes are always added for AArch64 targets (as
opposed, for example, to interpreting a missing attribute as having a
value 0) in order to be able to check for conflicts when combining
module attributed during LTO.

Module-level attributes are overridden by function level attributes.
All the decision making about whether to not to generate PAC and/or
BTI code is factored out into AArch64FunctionInfo, there shouldn't be
any places left, other than AArch64FunctionInfo, which directly
examine PAC/BTI attributes, except AArch64AsmPrinter.cpp, which
is/will-be handled by a separate patch.

Differential Revision: https://reviews.llvm.org/D85649
2020-09-25 11:47:14 +01:00
Chris Bowler 64b8a633a8 [NFC] [PPC] Add PowerPC expected IR tests for C99 complex
Adding this test so that I can extend it in a follow on patch with
expected IR for AIX when I implement complex handling in
AIXABIInfo.

Reviewed By: daltenty, ZarkoCA

Differential Revision: https://reviews.llvm.org/D88105
2020-09-24 23:28:40 -04:00
Ian Levesque 6f7fbdd285 [xray] Function coverage groups
Add the ability to selectively instrument a subset of functions by dividing the functions into N logical groups and then selecting a group to cover. By selecting different groups over time you could cover the entire application incrementally with lower overhead than instrumenting the entire application at once.

Differential Revision: https://reviews.llvm.org/D87953
2020-09-24 22:09:53 -04:00
Richard Smith 8c98c88034 PR47176: Don't read from an inactive union member if a friend function
has default arguments and an exception specification.
2020-09-24 19:02:27 -07:00
Reid Kleckner ecfc9b9712 [MS] For unknown ISAs, pass non-trivially copyable arguments indirectly
Passing them directly is likely to be non-conforming, since it usually
involves copying the bytes of the record. For unknown architectures, we
don't know what MSVC does or will do, but we should at least try to
conform as well as we can.
2020-09-24 16:29:48 -07:00
Reid Kleckner b8a50e9207 [MS] Simplify rules for passing C++ records
Regardless of the target architecture, we should always use the C rules
(RAA_Default) for records that "canBePassedInRegisters". Those are
trivially copyable things, and things marked with [[trivial_abi]].

This should be NFC, although it changes where the final decision about
x86_32 overaligned records is made. The current x86_32 C rules say that
overaligned things are passed indirectly, so there is no functional
difference.
2020-09-24 16:29:47 -07:00
Bill Wendling c9b53b3bf2 Fix regex in test. 2020-09-24 15:21:28 -07:00
Amy Huang c8df781e54 [DebugInfo] Fix bug in constructor homing with classes with trivial
constructors.

This changes the code to avoid using constructor homing for aggregate
classes and classes with trivial default constructors, instead of trying
to loop through the constructors.

Differential Revision: https://reviews.llvm.org/D87808
2020-09-24 14:43:48 -07:00
Bill Wendling f97b68ef4d Fix testcase. 2020-09-24 14:34:28 -07:00
Bill Wendling 34ca5b3392 Remove stale assert.
This is triggered during serialization. The test is for modules, but
will occur for any serialization effort using asm goto.

Reviewed By: nickdesaulniers, jyknight

Differential Revision: https://reviews.llvm.org/D88195
2020-09-24 13:59:42 -07:00
Alexey Bataev 579c42225a [OPENMP]Fix PR47621: Variable used by task inside a template function is not made firstprivate by default
Need to fix a check for the variable if it is declared in the inner
OpenMP region to be able to firstprivatize it.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D88240
2020-09-24 16:18:09 -04:00
Erich Keane 606a734755 [PR47636] Fix tryEmitPrivate to handle non-constantarraytypes
As mentioned in the bug report, tryEmitPrivate chokes on the
MaterializeTemporaryExpr in the reproducers, since it assumes that if
there are elements, than it must be a ConstantArrayType. However, the
MaterializeTemporaryExpr (which matches exactly the AST when it is NOT a
global/static) has an incomplete array type.

This changes the section where the number-of-elements is non-zero to
properly handle non-CAT types by just extracting it as an array type
(since all we needed was the element type out of it).
2020-09-24 12:09:22 -07:00
Alexey Bataev cde7d90cc7 Revert "[OPENMP]Fix PR47621: Variable used by task inside a template function is not made firstprivate by default"
This reverts commit d1419c9fda to fix the
buffer overflow detected by address sanitiizer.
2020-09-24 14:42:04 -04:00
Reid Kleckner 3453b6928d Revert "Recommit "[CUDA][HIP] Defer overloading resolution diagnostics for host device functions""
This reverts commit e39da8ab6a.

This depends on a change that needs additional design review and needs
to be reverted.
2020-09-24 11:16:54 -07:00
Alexey Bataev d1419c9fda [OPENMP]Fix PR47621: Variable used by task inside a template function is not made firstprivate by default
Need to fix a check for the variable if it is declared in the inner
OpenMP region to be able to firstprivatize it.

Differential Revision: https://reviews.llvm.org/D88240
2020-09-24 13:51:21 -04:00
Alexey Bataev a9fca98ee4 [OPENMP]PR47606: Do not update the lastprivate item if it was captured by reference as firstprivate data member.
No need to make final copy from the firsptrivate/lastprivate copy to the original item if the item is a data memeber.
Firstprivate copy creates a copy by reference and the original item gets
updated correctly when updating the lastprivate shared variable.

Differential Revision: https://reviews.llvm.org/D88179
2020-09-24 13:14:13 -04:00
Saleem Abdulrasool 296d8832a3 Sema: add support for `__attribute__((__swift_newtype__))`
Add the `swift_newtype` attribute which allows a type definition to be
imported into Swift as a new type.  The imported type must be either an
enumerated type (enum) or an object type (struct).

This is based on the work of the original changes in
8afaf3aad2

Differential Revision: https://reviews.llvm.org/D87652
Reviewed By: Aaron Ballman
2020-09-24 15:17:35 +00:00
Yaxun (Sam) Liu e39da8ab6a Recommit "[CUDA][HIP] Defer overloading resolution diagnostics for host device functions"
This recommits 7f1f89ec8d and
40df06cdaf after fixing memory
sanitizer failure.
2020-09-24 08:44:37 -04:00
Amy Kwan 6b136b19cb [Power10] Implement custom codegen for the vec_replace_elt and vec_replace_unaligned builtins.
This patch implements custom codegen for the vec_replace_elt and
vec_replace_unaligned builtins.

These builtins map to the @llvm.ppc.altivec.vinsw and @llvm.ppc.altivec.vinsd
intrinsics depending on the arguments. The main motivation for doing custom
codegen for these intrinsics is because there are float and double versions of
the builtin. Normally, the converting the float to an integer would be done via
fptoui in the IR. This is incorrect as fptoui truncates the value and we must
ensure the value is not truncated. Therefore, we provide custom codegen to utilize
bitcast instead as bitcasts do not truncate.

Differential Revision: https://reviews.llvm.org/D83500
2020-09-23 22:55:25 -05:00
Craig Topper d9717d8ee7 [X86] Add a memory clobber to the bittest intrinsic inline asm. Get default clobbers from the target
I believe the inline asm emitted here should have a memory clobber since it writes to memory.

It was also missing the dirflag clobber that we use by default along with flags and fpsr. To avoid missing defaults in the future, get the default list from the target

Differential Revision: https://reviews.llvm.org/D88121
2020-09-23 14:54:39 -07:00
Amy Kwan 2e7117f847 [PowerPC] Implement the 128-bit vec_[all|any]_[eq | ne | lt | gt | le | ge] builtins in Clang/LLVM
This patch implements the vec_[all|any]_[eq | ne | lt | gt | le | ge] builtins for vector signed/unsigned __int128.

Differential Revision: https://reviews.llvm.org/D87910
2020-09-23 16:49:40 -04:00
Albion Fung 88cdbeab41 [PowerPC] Implement Vector signed/unsigned __int128 overloads for the comparison builtins
This patch implements Vector signed/unsigned __int128 overloads for the comparison builtins.

Differential Revision: https://reviews.llvm.org/D87804
2020-09-23 16:49:40 -04:00
Aaron Ballman af1d3e6559 Allow init_priority values <= 100 and > 65535 within system headers.
This also adds some bare-bones documentation for the attribute rather
than leaving it undocumented.
2020-09-23 15:26:50 -04:00
Stanislav Mekhanoshin 59691dc874 [AMDGPU] Make ds fp atomics overloadable
Differential Revision: https://reviews.llvm.org/D87947
2020-09-23 11:39:50 -07:00
Sriraman Tallam 7d0bbe4090 Re-apply https://reviews.llvm.org/D87921, was reverted to triage a PPC bot failure.
D87921 was reverted in commit b89059a313
as it was causing an unknown llvm PPC bot failure.  Reapplying the patch
after confirming that this is not responsible. Build bot failure:
https://reviews.llvm.org/D87921#2286644  which caused the revert.

The wrong placement of add pass with optimizations led to
-funique-internal-linkage-names being disabled.

Fixed the placement of the MPM.addpass for UniqueInternalLinkageNames to make it
work correctly with -O2 and new pass manager. Updated the tests to explicitly
check O0 and O1.

Differential Revision: https://reviews.llvm.org/D87921
2020-09-23 10:28:40 -07:00
Mircea Trofin 271928792e Add REQUIRES to embed-bitcode-noopt.ll 2020-09-23 10:13:09 -07:00
Mircea Trofin 437358be71 [clang]Test ensuring -fembed-bitcode passed to cc1 captures pre-opt bitcode.
This is important to not regress because it allows us to capture pre-optimization
bitcode and options, and replay the full optimization pipeline.

Differential Revision: https://reviews.llvm.org/D88114
2020-09-23 09:35:28 -07:00
Yaxun (Sam) Liu e6d50b4f22 recommit [HIP] Fix -gsplit-dwarf option
recommit e50465ecef with fix for
regression in lldb tests.

Two issues:

1. the directory part of original .dwo file was dropped
2. if the stem of the .dwo file contains '.', the last dot
and strings after that were removed

This recommit fixes those two issues.
2020-09-23 11:20:29 -04:00
Yaxun (Sam) Liu 301e23305d [CUDA][HIP] Fix static device var used by host code only
A static device variable may be accessed in host code through
cudaMemCpyFromSymbol etc. Currently clang does not
emit the static device variable if it is only referenced by
host code, which causes host code to fail at run time.

This patch fixes that.

Differential Revision: https://reviews.llvm.org/D88115
2020-09-23 08:18:19 -04:00