Summary:
- In HIP, just as the regular device-only compilation, the device-only
relocatable code compilation should not involve offload bundle.
- In addition, that device-only relocatable code compilation should have
the similar 3 steps, namely preprocessor, compile, and backend, to the
regular code generation with `-emit-llvm`.
Reviewers: yaxunl, tra
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D81427
Summary:
ROCm.h had been getting the declarations for various data structures
by being #included next to them, rather than #includeing them itself.
This change fixes that by explicitly including the appropriate headers.
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D81432
Summary:
Add -ftrivial-auto-var-init-stop-after= to limit the number of times
stack variables are initialized when -ftrivial-auto-var-init= is used to
initialize stack variables to zero or a pattern. This flag can be used
to bisect uninitialized uses of a stack variable exposed by automatic
variable initialization, such as http://crrev.com/c/2020401.
Reviewers: jfb, vitalybuka, kcc, glider, rsmith, rjmccall, pcc, eugenis, vlad.tsyrklevich
Reviewed By: jfb
Subscribers: phosek, hubert.reinterpretcast, srhines, MaskRay, george.burgess.iv, dexonsmith, inglorion, gbiv, llozano, manojgupta, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77168
Summary: This patch removes the special handling for Darwin on PowerPC in the default target cpu handling, because Darwin is no longer supported on the PowerPC platform.
Reviewers: hubert.reinterpretcast, daltenty
Reviewed By: hubert.reinterpretcast
Subscribers: wuzish, nemanjai, shchenz, steven.zhang, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D81115
To support std::complex and some other standard C/C++ functions in HIP device code,
they need to be forced to be __host__ __device__ functions by pragmas. This is done
by some clang standard C++ wrapper headers which are shared between cuda-clang and hip-Clang.
For these standard C++ wapper headers to work properly, specific include path order
has to be enforced:
clang C++ wrapper include path
standard C++ include path
clang include path
Also, these C++ wrapper headers require device version of some standard C/C++ functions
must be declared before including them. This needs to be done by including a default
header which declares or defines these device functions. The default header is always
included before any other headers are included by users.
This patch adds the the default header and include path for HIP.
Differential Revision: https://reviews.llvm.org/D81176
Follow the model used on Linux, where the clang driver passes the
linker a -u switch to force the profile runtime to be linked in,
rather than having every TU emit a dead function with a reference.
Differential Revision: https://reviews.llvm.org/D79835
Follow the model used on Linux, where the clang driver passes the
linker a -u switch to force the profile runtime to be linked in,
rather than having every TU emit a dead function with a reference.
Patch By: mcgrathr
Differential Revision: https://reviews.llvm.org/D79835
Summary: This patch changes the AIX default target CPU to power4 since this is the the lowest arch for the lowest OS level supported.
Reviewers: hubert.reinterpretcast, cebowleratibm, daltenty
Reviewed By: hubert.reinterpretcast
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D80835
Summary:
An upgrade of LLVM for CrOS [0] containing [1] triggered a bunch of
errors related to writing to reserved registers for a Linux kernel's
arm64 compat vdso (which is a aarch32 image).
After a discussion on LKML [2], it was determined that
-f{no-}omit-frame-pointer was not being specified. Comparing GCC and
Clang [3], it becomes apparent that GCC defaults to omitting the frame
pointer implicitly when optimizations are enabled, and Clang does not.
ie. setting -O1 (or above) implies -fomit-frame-pointer. Clang was
defaulting to -fno-omit-frame-pointer implicitly unless -fomit-frame-pointer
was set explicitly.
Why this becomes a problem is that the Linux kernel's arm64 compat vdso
contains code that uses r7. r7 is used sometimes for the frame pointer
(for example, when targeting thumb (-mthumb)). See useR7AsFramePointer()
in llvm/llvm-project/llvm/lib/Target/ARM/ARMSubtarget.h. This is mostly
for legacy/compatibility reasons, and the 2019 Q4 revision of the ARM
AAPCS looks to standardize r11 as the frame pointer for aarch32, though
this is not yet implemented in LLVM.
Users that are reliant on the implicit value if unspecified when
optimizations are enabled should explicitly choose -fomit-frame-pointer
(new behavior) or -fno-omit-frame-pointer (old behavior).
[0] https://bugs.chromium.org/p/chromium/issues/detail?id=1084372
[1] https://reviews.llvm.org/D76848
[2] https://lore.kernel.org/lkml/20200526173117.155339-1-ndesaulniers@google.com/
[3] https://godbolt.org/z/0oY39t
Reviewers: kristof.beyls, psmith, danalbert, srhines, MaskRay, ostannard, efriedma
Reviewed By: psmith, danalbert, srhines, MaskRay, efriedma
Subscribers: efriedma, olista01, MaskRay, vhscampos, cfe-commits, llvm-commits, manojgupta, llozano, glider, hctim, eugenis, pcc, peter.smith, srhines
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D80828
This patch adds clang options:
-fbasic-block-sections={all,<filename>,labels,none} and
-funique-basic-block-section-names.
LLVM Support for basic block sections is already enabled.
+ -fbasic-block-sections={all, <file>, labels, none} : Enables/Disables basic
block sections for all or a subset of basic blocks. "labels" only enables
basic block symbols.
+ -funique-basic-block-section-names: Enables unique section names for
basic block sections, disabled by default.
Differential Revision: https://reviews.llvm.org/D68049
These are mapped in MachO::getMachOArchName already, but were missing
in ToolChain::getDefaultUniversalArchName.
Having these reverse mapped here fixes weird inconsistencies like
-dumpmachine showing a target triple like "aarch64-apple-darwin",
while "clang -target aarch64-apple-darwin" didn't use to work (ended
up mapped as unknown-apple-ios).
Differential Revision: https://reviews.llvm.org/D79117
Summary: Before this patch, we use two different ways to pass options to align branch
depending on whether LTO is enabled. For example, `-mbranches-within-32B-boundaries`
w/o LTO and `-Wl,-plugin-opt=-x86-branches-within-32B-boundaries` w/ LTO. It's
inconvenient, so this patch unifies the way: we only need to pass options like
`-mbranches-within-32B-boundaries` to align branches, no matter LTO is enabled or not.
Differential Revision: https://reviews.llvm.org/D80289
Summary:
This patch simply adds support for the new CPU in anticipation of
Power10. There isn't really any functionality added so there are no
associated test cases at this time.
Reviewers: stefanp, nemanjai, amyk, hfinkel, power-llvm-team, #powerpc
Reviewed By: stefanp, nemanjai, amyk, #powerpc
Subscribers: NeHuang, steven.zhang, hiraditya, llvm-commits, wuzish, shchenz, cfe-commits, kbarton, echristo
Tags: #clang, #powerpc, #llvm
Differential Revision: https://reviews.llvm.org/D80020
Summary:
This patch simply adds support for the new CPU in anticipation of
Power10. There isn't really any functionality added so there are no
associated test cases at this time.
Reviewers: stefanp, nemanjai, amyk, hfinkel, power-llvm-team, #powerpc
Reviewed By: stefanp, nemanjai, amyk, #powerpc
Subscribers: NeHuang, steven.zhang, hiraditya, llvm-commits, wuzish, shchenz, cfe-commits, kbarton, echristo
Tags: #clang, #powerpc, #llvm
Differential Revision: https://reviews.llvm.org/D80020
-fno-semantic-interposition is currently the CC1 default. (The opposite
disables some interprocedural optimizations.) However, it does not infer
dso_local: on most targets accesses to ExternalLinkage functions/variables
defined in the current module still need PLT/GOT.
This patch makes explicit -fno-semantic-interposition infer dso_local,
so that PLT/GOT can be eliminated if targets implement local aliases
for AsmPrinter::getSymbolPreferLocal (currently only x86).
Currently we check whether the module flag "SemanticInterposition" is 0.
If yes, infer dso_local. In the future, we can infer dso_local unless
"SemanticInterposition" is 1: frontends other than clang will also
benefit from the optimization if they don't bother setting the flag.
(There will be risks if they do want ELF interposition: they need to set
"SemanticInterposition" to 1.)
Summary: On AIX, add '-bcdtors:all:0:s' to the linker implicitly through the driver so that we can collect all static constructor and destructor functions.
Reviewers: hubert.reinterpretcast, Xiangling_L, ZarkoCA, daltenty
Reviewed By: hubert.reinterpretcast
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D80415
The various HIP builds are all inconsistent.
The default llvm install goes to ${INSTALL_PREFIX}/bin/clang, but the
rocm packaging scripts move this under
${INSTALL_PREFIX}/llvm/bin/clang. Some other builds further pollute
this with ${INSTALL_PREFIX}/bin/x86_64/clang. These should really be
consolidated, but try to handle them for now.
This is required to get avr-gdb correctly showing values at the right
addresses. This problem was discovered by using debug symbols in an
external program to lookup values in an AVR simulator.
Enables Machine Outlining for ARM and Thumb2 modes. This is the first
patch of the series which adds all the basic logic for the support, and
only handles tail-calls and thunks.
The outliner can be turned on by using clang -moutline option or -mllvm
-enable-machine-outliner one (like AArch64).
Differential Revision: https://reviews.llvm.org/D76066
Commit 73152a2ec2 fixed type checking for
blocks with qualified id parameters. But there are existing APIs in
Apple SDKs relying on the old type checking behavior. Specifically,
these are APIs using NSItemProviderCompletionHandler in
Foundation/NSItemProvider.h. To keep existing code working and to allow
developers to use affected APIs introduce a compatibility mode that
enables the previous and the fixed type checking. This mode is enabled
only on Darwin platforms.
Reviewed By: jyknight, ahatanak
Differential Revision: https://reviews.llvm.org/D79511
Fixes PR42445 (compiler driver options -Os -Oz translate to
-plugin-opt=Os (Oz) which are not recognized by LLVMgold.so or LLD).
The optimization level mapping matches
CompilerInvocation.cpp:getOptimizationLevel() and SpeedLevel of
PassBuilder::OptimizationLevel::O*.
-plugin-opt=O* affects the way we construct regular LTO/ThinLTO pass
manager pipeline.
Reviewed By: pcc
Differential Revision: https://reviews.llvm.org/D79919
-nogpulib makes sense when there is a host (where -nostdlib would
apply) and offload target. Accept nostdlib when there is no offload
target as an alias.
Merge with the new --rocm-path handling used for OpenCL. This looks
for a usable set of device libraries upfront, rather than giving a
generic "no such file or directory error". If any of the required
bitcode libraries are missing, this will now produce a "cannot find
ROCm installation." error. This differs from the existing hip specific
flags by pointing to a rocm root install instead of a single directory
with bitcode files.
This tries to maintain compatibility with the existing the
--hip-device-lib and --hip-device-lib-path flags, as well as the
HIP_DEVICE_LIB_PATH environment variable, or at least the range of
uses with testcases. The existing range of uses and behavior doesn't
entirely make sense to me, so some of the untested edge cases change
behavior. Currently the two path forms seem to have the double purpose
of a search path for an arbitrary --hip-device-lib, and for finding
the stock set of libraries. Since the stock set of libraries This also
changes the behavior when multiple paths are specified, and only takes
the last one (and the environment variable only handles a single
path).
If --hip-device-lib is used, it now only treats --hip-device-lib-path
as the search path for it, and does not attempt to find the rocm
installation. If not, --hip-device-lib-path and the environment
variable are used as the directory to search instead of the rocm root
based path.
This should also automatically fix handling of the options to use
wave64.