- The goal of this patch is to improve option compatibility with RISC-V GCC;
the patch for -mcpu support on the GCC side will be sent in the next few days.
- -mtune only affects the pipeline model and target features unrelated to the
arch/extensions, e.g. instruction fusion; in the .td file these are called
TuneFeatures, which were introduced by the X86 back-end[1].
- -mtune accepts all valid options for -mcpu plus extra processor aliases,
e.g. `generic`, `rocket` and `sifive-7-series`; the purpose is option
compatibility with RISC-V GCC.
- Processor aliases for -mtune are resolved according to the current target arch,
rv32 or rv64, e.g. `rocket` resolves to `rocket-rv32` or `rocket-rv64`.
- Interaction between -mcpu and -mtune:
* -mtune has higher priority than -mcpu for pipeline model and
TuneFeatures.
[1] https://reviews.llvm.org/D85165
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D89025
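The alias resolution and the -mtune/-mcpu priority described in the list above can be sketched as follows. This is a minimal illustration, not the driver's code: the table `TUNE_ALIASES` and both function names are hypothetical, and only the `rocket` mapping is taken from the text.

```python
# Hypothetical alias table: alias -> (rv32 name, rv64 name).
# Only the `rocket` entry comes from the commit message above.
TUNE_ALIASES = {
    "rocket": ("rocket-rv32", "rocket-rv64"),
}


def resolve_tune_alias(name, is_rv64):
    """Resolve an -mtune alias for the current arch; pass other names through."""
    if name in TUNE_ALIASES:
        rv32_name, rv64_name = TUNE_ALIASES[name]
        return rv64_name if is_rv64 else rv32_name
    return name


def select_tune_cpu(mcpu, mtune, is_rv64):
    """-mtune wins over -mcpu for the pipeline model; otherwise fall back to -mcpu."""
    if mtune is not None:
        return resolve_tune_alias(mtune, is_rv64)
    return mcpu
```

Names that are not aliases are returned unchanged, which matches the statement that -mtune accepts every value valid for -mcpu.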
This reverts commits 683b308c07 and
8487bfd4e9.
We will go for a more restricted approach that does not give freedom to
everyone to change ABIs on whichever platform.
See the discussion on https://reviews.llvm.org/D85802.
This implements the flag proposed in RFC http://lists.llvm.org/pipermail/cfe-dev/2020-August/066437.html.
The goal is to add a way to override the default target C++ ABI through
a compiler flag. This makes it easier to test and transition between different
C++ ABIs through compile flags rather than build flags.
In this patch:
- Store `-fc++-abi=` in a LangOpt. This isn't stored in a
CodeGenOpt because there are instances outside of codegen where Clang
needs to know what the ABI is (particularly through
ASTContext::createCXXABI), and we should be able to override the
target default if the flag is provided at that point.
- Expose the existing ABIs in TargetCXXABI as values that can be passed
through this flag.
- Create a .def file for these ABIs to make it easier to check flag
values.
- Add an error for diagnosing bad ABI flag values.
Differential Revision: https://reviews.llvm.org/D85802
clang --target arm-none-eabi --print-libgcc-file-name --rtlib=compiler-rt
used to print `/path/to/lib/clang/version/lib/libclang_rt.builtins-arm.a`
but should print `/path/to/lib/clang/version/lib/baremetal/libclang_rt.builtins-arm.a`.
Similarly, --target armv7m-none-eabi should print libclang_rt.builtins-armv7m.a
This matches the compiler-rt file name used at link time in the
baremetal driver.
Reviewed By: manojgupta
Differential Revision: https://reviews.llvm.org/D89327
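As a rough sketch of the file name this change is meant to print, the path can be assembled as below. The function name and its parameters are hypothetical; the directory layout and the `libclang_rt.builtins-<arch>.a` pattern are taken from the examples above.

```python
import posixpath


def baremetal_builtins_path(resource_dir, arch):
    """Build the compiler-rt builtins path for a baremetal target."""
    file_name = f"libclang_rt.builtins-{arch}.a"
    # The baremetal driver keeps its runtime libraries in lib/baremetal.
    return posixpath.join(resource_dir, "lib", "baremetal", file_name)
```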
Summary:
This patch does the following:
1. Make InitTargetOptionsFromCodeGenFlags() accept a Triple as a
parameter, because some options' default values are triple dependent.
2. DataSections is turned on by default on AIX for llc.
3. Test cases change accordingly because of the default behaviour change.
4. Clang Driver passes in -fdata-sections by default on AIX.
Reviewed By: MaskRay, DiggerLin
Differential Revision: https://reviews.llvm.org/D88737
Currently, Clang looks for libc++ headers alongside the installation
directory of Clang, and it also adds a search path for headers in the
-isysroot. This is problematic if headers are found in both the toolchain
and in the sysroot, since #include_next will end up finding the libc++
headers in the sysroot instead of the intended system headers.
This patch changes the logic such that if the toolchain contains libc++
headers, no C++ header paths are added in the sysroot. However, if the
toolchain does *not* contain libc++ headers, the sysroot is searched as
usual.
This should not be a breaking change, since any code that previously
relied on some libc++ headers being found in the sysroot suffered from
the #include_next issue described above, which renders any libc++ header
basically useless.
Differential Revision: https://reviews.llvm.org/D89001
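The new search logic can be summarized in a short sketch. The directory layout (`include/c++/v1` under the toolchain, `usr/include/c++/v1` under the sysroot), the function name, and the injectable `exists` predicate are assumptions made for illustration only.

```python
import os
import posixpath


def cxx_header_search_paths(toolchain_dir, sysroot_dir, exists=os.path.isdir):
    """Pick libc++ header paths: toolchain first, sysroot only as a fallback."""
    # Hypothetical locations of the libc++ headers.
    toolchain_headers = posixpath.join(toolchain_dir, "include", "c++", "v1")
    sysroot_headers = posixpath.join(sysroot_dir, "usr", "include", "c++", "v1")
    if exists(toolchain_headers):
        # Toolchain has libc++: do not add any C++ header path from the sysroot.
        return [toolchain_headers]
    # No libc++ in the toolchain: search the sysroot as usual.
    return [sysroot_headers]
```

Returning only one of the two paths is what avoids #include_next landing on a second copy of libc++.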
SUMMARY:
In the IBM compiler xlclang, there is an option -fnovisibility which suppresses visibility. For more details see: https://www.ibm.com/support/knowledgecenter/SSGH3R_16.1.0/com.ibm.xlcpp161.aix.doc/compiler_ref/opt_visibility.html.
We need to add the option -mignore-xcoff-visibility for compatibility with the IBM AIX OS (as the option is enabled by default on AIX). With this option, LLVM does not emit any visibility attribute to the ASM or XCOFF object file.
The option only works on AIX; on any other OS, using the option reports an unsupported option error.
On AIX:
1.1 The option -mignore-xcoff-visibility is enabled by default if neither -fvisibility=* nor -mignore-xcoff-visibility is given explicitly in the clang command.
1.2 If -fvisibility=* is given explicitly but -mignore-xcoff-visibility is not, clang generates visibility attributes.
1.3 If both -fvisibility=* and -mignore-xcoff-visibility are given explicitly, "-mignore-xcoff-visibility" wins and the visibility attribute is not emitted.
The option -mignore-xcoff-visibility has no effect on the visibility attribute when compiling with -emit-llvm to generate LLVM IR.
Reviewer: daltenty,Jason Liu
Differential Revision: https://reviews.llvm.org/D87451
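Rules 1.1 to 1.3 above, plus the non-AIX error, amount to a small decision function. The sketch below is only a restatement of those rules; the function name, the boolean parameters, and the use of `ValueError` for the unsupported-option case are hypothetical.

```python
def ignore_xcoff_visibility(is_aix, has_fvisibility, has_mignore):
    """Return True when visibility attributes should be suppressed."""
    if not is_aix:
        if has_mignore:
            # The option is only supported on AIX.
            raise ValueError("unsupported option '-mignore-xcoff-visibility'")
        return False
    if has_mignore:
        # Rule 1.3: the explicit option wins, even next to -fvisibility=*.
        return True
    # Rule 1.1: on by default without -fvisibility=*.
    # Rule 1.2: an explicit -fvisibility=* alone turns visibility back on.
    return not has_fvisibility
```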
An object of class `Command` contains various properties of a command to
execute, but the output file was missing from them. This change adds this
property. It is required for reporting consumed time and memory, implemented
in D78903, and may be used in other cases too.
Differential Revision: https://reviews.llvm.org/D78902
A lot of our code built with clang-cl.exe using Clang 11 was failing with
the following two types of errors:
1. explicit specialization of 'foo' after instantiation
2. no matching function for call to 'bar'
Note that we also use -fdelayed-template-parsing in our builds.
I tried pretty hard to get a small repro for these failures, but couldn't. So
there is some subtle edge case in the -fpch-instantiate-templates feature
introduced by this change: https://reviews.llvm.org/D69585
When I tried turning this off using -fno-pch-instantiate-templates, builds
would silently fail with the same error without any indication that
-fno-pch-instantiate-templates was being ignored by the compiler. Then I
realized this "no" option wasn't actually working when I ran Clang under a
debugger.
Differential revision: https://reviews.llvm.org/D88680
By convention the default output file for -E is "-" (stdout).
This is expected by tools like ccache, which uses the output
of -E to determine whether a file and its dependencies have changed.
Currently clang does not use stdout as the default output file for -E
for HIP, which breaks ccache.
This patch fixes that.
Differential Revision: https://reviews.llvm.org/D88730
Add an option --gpu-instrument-lib= to allow users to specify
an instrument device library. This is for supporting -finstrument
in device code for debugging/profiling tools.
Differential Revision: https://reviews.llvm.org/D88557
This helper method is useful even outside of Gnu toolchains, so move
it to ToolChain so it can be reused in other toolchains such as Fuchsia.
Differential Revision: https://reviews.llvm.org/D88452
The AMDGPU toolchain currently only diagnoses an invalid target ID for OpenCL
source compilation. An invalid target ID is not diagnosed for the assembler.
This patch fixes that.
Differential Revision: https://reviews.llvm.org/D88377
Currently the CUDA/HIP toolchain uses "unknown" as the bound arch
for the offload action for the fat binary. This causes -mcpu or -march
with "unknown" to be added in HIPToolChain::TranslateArgs or
CUDAToolChain::TranslateArgs.
This causes an issue for https://reviews.llvm.org/D88377 since the
HIP toolchain needs to check -mcpu in HIPToolChain::TranslateArgs.
The bound arch of offload action for fat binary is not really
used, therefore set it to CudaArch::UNUSED.
Differential Revision: https://reviews.llvm.org/D88524
To facilitate faster loading of device binaries and share them among processes,
HIP runtime favors their alignment being 4096 bytes. HIP runtime can load
unaligned device binaries, however, aligning them at 4096 bytes results in
faster loading and less shared memory usage.
This patch adds an option -bundle-align to clang-offload-bundler which allows
bundles to be aligned at specified alignment. By default it is 1, which is NFC
compared to existing format.
This patch then aligns embedded fat binary and device binary inside fat binary
at 4096 bytes.
It has been verified this change does not cause significant overall file size increase
for typical HIP applications (less than 1%).
Differential Revision: https://reviews.llvm.org/D88734
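The effect of the alignment option on bundle placement can be illustrated with plain offset arithmetic. This is a sketch of the rounding only, not of the real bundle format (which also has a header); both function names are hypothetical.

```python
def align_up(offset, alignment=1):
    """Round offset up to the next multiple of alignment."""
    if alignment < 1:
        raise ValueError("alignment must be at least 1")
    return (offset + alignment - 1) // alignment * alignment


def bundle_offsets(sizes, alignment=1, start=0):
    """Return the start offset of each bundle when laid out one after another."""
    offsets = []
    cursor = start
    for size in sizes:
        cursor = align_up(cursor, alignment)  # pad up to the requested boundary
        offsets.append(cursor)
        cursor += size
    return offsets
```

With the default alignment of 1 no padding is ever inserted, which is why the default leaves the existing layout unchanged.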
This adds support for -mcpu=cortex-r82. Some more information about this
core can be found here:
https://www.arm.com/products/silicon-ip-cpu/cortex-r/cortex-r82
One note about the system register: that is a bit of a refactoring because of
small differences between v8.4-A AArch64 and v8-R AArch64.
This is based on patches from Mark Murray and Mikhail Maltsev.
Differential Revision: https://reviews.llvm.org/D88660
since that is the normal behaviour of other compilers on the platform.
Reviewed By: hubert.reinterpretcast
Differential Revision: https://reviews.llvm.org/D88500
GCC 7 introduced -fprofile-update={atomic,prefer-atomic} (prefer-atomic is for
best efforts (some targets do not support atomics)) to increment counters
atomically, which is exactly what we have done with -fprofile-instr-generate
(D50867) and -fprofile-arcs (b5ef137c11).
This patch adds the option to clang to surface the internal options at driver level.
GCC 7 also turned on -fprofile-update=prefer-atomic when -pthread is specified,
but it has performance regression
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89307). So we don't follow suit.
Differential Revision: https://reviews.llvm.org/D87737
Add the ability to selectively instrument a subset of functions by dividing the functions into N logical groups and then selecting a group to cover. By selecting different groups over time you could cover the entire application incrementally with lower overhead than instrumenting the entire application at once.
Differential Revision: https://reviews.llvm.org/D87953
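One way to divide functions into N stable groups is to hash the function name, as sketched below. The use of MD5 and both function names are assumptions for illustration; the text above does not say how the grouping is computed.

```python
import hashlib


def function_group(name, num_groups):
    """Map a function name to a group index in [0, num_groups)."""
    digest = hashlib.md5(name.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "little") % num_groups


def functions_to_instrument(names, num_groups, selected_group):
    """Keep only the functions that fall into the selected group."""
    return [n for n in names if function_group(n, num_groups) == selected_group]
```

Because every name maps to exactly one group, cycling the selected group from 0 to N-1 covers each function once.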
Check whether /etc/env.d/gcc exists before trying to read from any
file from there. This saves a few OS calls on a non-Gentoo system.
Differential Revision: https://reviews.llvm.org/D87143
recommit e50465ecef with fix for
regression in lldb tests.
Two issues:
1. the directory part of original .dwo file was dropped
2. if the stem of the .dwo file contains '.', the last dot
and strings after that were removed
This recommit fixes those two issues.
since crti is required for functional static initialization.
Reviewed By: hubert.reinterpretcast
Differential Revision: https://reviews.llvm.org/D87927
Set the default wchar_t type on z/OS, and unsigned as the default.
Reviewed By: hubert.reinterpretcast, fanbo-meng
Differential Revision: https://reviews.llvm.org/D87624
When cross compiling with clang-cl, clang splits the INCLUDE env
variable around semicolons (clang/lib/Driver/ToolChains/MSVC.cpp,
MSVCToolChain::AddClangSystemIncludeArgs) and lld splits the
LIB variable similarly (lld/COFF/Driver.cpp,
LinkerDriver::addLibSearchPaths). Therefore, the consensus for
cross compilation with clang-cl and lld-link seems to be to use
semicolons, despite path lists normally being separated by colons
on unix and EnvPathSeparator being set to that.
Therefore, handle the LIB variable similarly in Clang, when
handling lib file arguments when driving linking via Clang.
This fixes commands like "clang-cl test.c -Fetest.exe kernel32.lib" in
a cross compilation setting. Normally, most users call (lld-)link
directly, but meson happens to use this command syntax for
has_function() tests.
Reapply: Change Program.h to define procid_t as ::pid_t. When included
in lldb/unittests/Host/NativeProcessProtocolTest.cpp, it is included
after an lldb namespace containing an lldb::pid_t typedef, followed
later by a "using namespace lldb;". Previously, Program.h wasn't
included in this translation unit, but now it ends up included
transitively from Process.h.
Differential Revision: https://reviews.llvm.org/D88002
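The lookup described above, splitting LIB on semicolons regardless of the host's path separator, might look like the following sketch. The function names and the injectable `exists` predicate are hypothetical.

```python
import os
import posixpath


def split_lib_paths(lib_env):
    """Split a LIB-style value on semicolons, dropping empty entries."""
    return [entry for entry in lib_env.split(";") if entry]


def find_lib_file(name, lib_env, exists=os.path.exists):
    """Return the first directory entry in LIB that contains the library."""
    for directory in split_lib_paths(lib_env):
        candidate = posixpath.join(directory, name)
        if exists(candidate):
            return candidate
    return None
```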
This reverts commit 4d85444b31.
This commit broke building lldb's NativeProcessProtocolTest.cpp,
with errors like these:
In file included from include/llvm/Support/Process.h:32:0,
from tools/lldb/unittests/Host/NativeProcessProtocolTest.cpp:12:
include/llvm/Support/Program.h:39:11: error: reference to ‘pid_t’ is ambiguous
typedef pid_t procid_t;
/usr/include/sched.h:38:17: note: candidates are: typedef __pid_t pid_t
typedef __pid_t pid_t;
tools/lldb/include/lldb/lldb-types.h:85:18: note: typedef uint64_t lldb::pid_t
typedef uint64_t pid_t;
GCC 8 changed behaviour wrt this, and made it consistent for cross
compilation cases. While it's a change, it's a more sensible behaviour
going forward.
Differential Revision: https://reviews.llvm.org/D88005
When the -gsplit option is used with the clang driver, the driver creates
a filename with a .dwo extension based on the input file name and passes
it to clang -cc1. This file is used for storing the debug info. Since
HIP generates separate object files for different GPU archs,
this file should be different for each GPU arch. This patch
adds _ and the GPU arch to the stem of the dwo file.
Differential Revision: https://reviews.llvm.org/D87791
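The naming rule can be sketched as below, keeping the directory part and removing only the final extension before appending the arch. The function name is hypothetical and the arch string used in the usage example is only an illustrative value.

```python
import posixpath


def hip_dwo_name(input_path, gpu_arch):
    """Derive a per-arch .dwo file name from an input file name."""
    # splitext removes only the last extension, so the directory part and
    # any earlier dots in the stem are preserved.
    stem, _extension = posixpath.splitext(input_path)
    return f"{stem}_{gpu_arch}.dwo"
```

For example, `hip_dwo_name("dir/a.b.hip", "gfx906")` yields `dir/a.b_gfx906.dwo`.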
Enforcing a profile available check in the driver does not work with
incremental LTO builds where the LTO backend invocation does not include
the profile flags. At this point the profiles have already been consumed
and the IR contains profile metadata. Instead we always pass through the
-fsplit-machine-functions flag on user request. The pass itself contains
a check to return early if no profile information is available.
Differential Revision: https://reviews.llvm.org/D87943
Currently x18 is assumed to be used as the pointer to the shadow call stack. Users shall pass
the flags:
"-fsanitize=shadow-call-stack -ffixed-x18"
Runtime support is needed to set up x18.
If SCS is desired, all parts of the program should be built with -ffixed-x18 to
maintain interoperability.
There's no particular reason that we must use x18 as the SCS pointer. Any register
may be used, as long as it does not already have a designated purpose, like RA or
passing call arguments.
Differential Revision: https://reviews.llvm.org/D84414
Initial support for dwarf fission sections (-gsplit-dwarf) on wasm.
The most interesting change is support for writing 2 files (.o and .dwo) in the
wasm object writer. My approach moves object-writing logic into its own function
and calls it twice, swapping out the endian::Writer (W) in between calls.
It also splits the import-preparation step into its own function (and skips it when writing a dwo).
Differential Revision: https://reviews.llvm.org/D85685
In CUDA/HIP a function may become implicit host device function by
pragma or constexpr. A host device function is checked in both
host and device compilation. However it may be emitted only
on host or device side, therefore the diagnostics should be
deferred until it is known to be emitted.
Currently clang is only able to defer certain diagnostics. This causes
false alarms and limits the usefulness of host device functions.
This patch lets clang defer all overloading resolution diagnostics for host device functions.
An option -fgpu-defer-diag is added to control this behavior. By default
it is off.
It is NFC for other languages.
Differential Revision: https://reviews.llvm.org/D84364
Writing the .note.gnu.property manually is error prone and hard to
maintain in the assembly files.
The -mmark-bti-property is for the assembler to emit the section with the
GNU_PROPERTY_AARCH64_FEATURE_1_BTI. To be used when C/C++ is compiled
with -mbranch-protection=bti.
This patch refactors the .note.gnu.property handling.
Reviewed By: chill, nickdesaulniers
Differential Revision: https://reviews.llvm.org/D81930
Reland with test dependency on aarch64 target.
Aligned allocation is not supported on z/OS. This patch sets -faligned-alloc-unavailable as default in z/OS toolchain.
Reviewed By: abhina.sreeskantharajan, hubert.reinterpretcast
Differential Revision: https://reviews.llvm.org/D87611
This patch adds a command line flag for the machine function splitter
(added in rG94faadaca4e1).
-fsplit-machine-functions
Split machine functions using profile information (x86 ELF). On
other targets an error is emitted. If profile information is not
provided a warning is emitted notifying the user that profile
information is required.
Differential Revision: https://reviews.llvm.org/D87047
This is consistent with the clang option added in
7ed8124d46, and the comments on the
runtime patch in D87120.
Differential Revision: https://reviews.llvm.org/D87622
Also add the +mutable-globals features in clang when
building with `-fPIC` since the linker will generate mutable
globals imports and exports in that case.
Differential Revision: https://reviews.llvm.org/D87537
gcc translates -gz=zlib to --compress-debug-sections=zlib for both the assembler and the linker,
but clang only does this for the assembler.
The linker needs the --compress-debug-sections=zlib option to compress the debug sections
in the generated executable or shared library.
Due to this bug, -gz=zlib has no effect on the generated executable or shared library.
This patch fixes that.
Differential Revision: https://reviews.llvm.org/D87321
Summary:
This is the first patch implementing the new Flang driver as outlined in [1],
[2] & [3]. It creates Flang driver (`flang-new`) and Flang frontend driver
(`flang-new -fc1`). These will be renamed as `flang` and `flang -fc1` once the
current Flang throwaway driver, `flang`, can be replaced with `flang-new`.
Currently only 2 options are supported: `-help` and `--version`.
`flang-new` is implemented in terms of libclangDriver, defaulting the driver
mode to `FlangMode` (added to libclangDriver in [4]). This ensures that the
driver runs in Flang mode regardless of the name of the binary inferred from
argv[0].
The design of the new Flang compiler and frontend drivers is inspired by their
counterparts in Clang [3]. Currently, the new Flang compiler and frontend
drivers re-use Clang libraries: clangBasic, clangDriver and clangFrontend.
To identify Flang options, this patch adds FlangOption/FC1Option enums.
Driver::printHelp is updated so that `flang-new` prints only Flang options.
The new Flang driver is disabled by default. To enable it, set
`-DBUILD_FLANG_NEW_DRIVER=ON` when configuring CMake and add clang to
`LLVM_ENABLE_PROJECTS` (e.g. `-DLLVM_ENABLE_PROJECTS="clang;flang;mlir"`).
[1] “RFC: new Flang driver - next steps”
http://lists.llvm.org/pipermail/flang-dev/2020-July/000470.html
[2] “RFC: Adding a fortran mode to the clang driver for flang”
http://lists.llvm.org/pipermail/cfe-dev/2019-June/062669.html
[3] “RFC: refactoring libclangDriver/libclangFrontend to share with Flang”
http://lists.llvm.org/pipermail/cfe-dev/2020-July/066393.html
[4] https://reviews.llvm.org/rG6bf55804924d5a1d902925ad080b1a2b57c5c75c
co-authored-by: Andrzej Warzynski <andrzej.warzynski@arm.com>
Reviewed By: richard.barton.arm, sameeranjoshi
Differential Revision: https://reviews.llvm.org/D86089
As reported in Bug 42535, `clang` doesn't inline atomic ops on 32-bit
Sparc, unlike `gcc` on Solaris. In a 1-stage build with `gcc`, only two
testcases are affected (currently `XFAIL`ed), while in a 2-stage build more
than 100 tests `FAIL` due to this issue.
The reason for this `gcc`/`clang` difference is that `gcc` on 32-bit
Solaris/SPARC defaults to `-mcpu=v9` where atomic ops are supported, unlike
with `clang`'s default of `-mcpu=v8`. This patch changes `clang` to use
`-mcpu=v9` on 32-bit Solaris/SPARC, too.
Doing so uncovered two bugs:
`clang -m32 -mcpu=v9` chokes with any Solaris system headers included:
/usr/include/sys/isa_defs.h:461:2: error: "Both _ILP32 and _LP64 are defined"
#error "Both _ILP32 and _LP64 are defined"
While `clang` currently defines `__sparcv9` in a 32-bit `-mcpu=v9`
compilation, neither `gcc` nor Studio `cc` does. In fact, the Studio 12.6
`cc(1)` man page clearly states:
These predefinitions are valid in all modes:
[...]
__sparcv8 (SPARC)
__sparcv9 (SPARC -m64)
At the same time, the patch defines `__GCC_HAVE_SYNC_COMPARE_AND_SWAP_[1248]`
for a 32-bit Sparc compilation with any V9 cpu. I've also changed
`MaxAtomicInlineWidth` for V9, matching what `gcc` does and the Oracle
Developer Studio 12.6: C User's Guide documents (Ch. 3, Support for Atomic
Types, 3.1 Size and Alignment of Atomic C Types).
The two testcases that had been `XFAIL`ed for Bug 42535 are un-`XFAIL`ed
again.
Tested on `sparcv9-sun-solaris2.11` and `amd64-pc-solaris2.11`.
Differential Revision: https://reviews.llvm.org/D86621
Currently AMDGPU does not support sanitizer. Disable
sanitizer options for now until they are supported.
Differential Revision: https://reviews.llvm.org/D87461
Basic block sections are untested on platforms and binary formats other
than x86 ELF. This patch emits a warning and drops the flag if the platform
and binary format are not compatible. Add a test to ensure that
specifying an incompatible target in the driver does not enable the
feature.
Differential Revision: https://reviews.llvm.org/D87426
This effectively disables r340386 on Darwin, and provides a command line flag
to opt into/out of this behaviour. This change is needed to compile certain
Apple headers correctly.
rdar://47688592
Differential revision: https://reviews.llvm.org/D86881
For the PS4, do not emit "-tune-cpu generic" since the platform only has 1 known CPU and we do not want to prevent optimizations by tuning for a generic rather than the specific processor it contains.
Reviewed By: probinson
Differential Revision: https://reviews.llvm.org/D86965
This patch adds the initial toolchain for z/OS that will set some defaults. In subsequent patches, we plan to add support to use the system linker and assembler.
Reviewed By: hubert.reinterpretcast
Differential Revision: https://reviews.llvm.org/D86707
This CL modifies clang to enable the use of -fsanitize=thread on Fuchsia. The
change doesn't build the runtime for fuchsia, it just enables the
instrumentation to be used.
pair-programmed-with: mdempsky@google.com
Change-Id: I816c4d240d1f15e9eae2803fb8ba3a7bf667ed51
Reviewed By: mcgrathr, phosek
Differential Revision: https://reviews.llvm.org/D86822
It's not undefined behavior for an unsigned left shift to overflow (i.e. to
shift bits out), but it has been the source of bugs and exploits in certain
codebases in the past. As we do in other parts of UBSan, this patch adds a
dynamic checker which acts beyond UBSan and checks other sources of errors. The
option is enabled as part of -fsanitize=integer.
The flag is named: -fsanitize=unsigned-shift-base
This matches shift-base and shift-exponent flags.
<rdar://problem/46129047>
Differential Revision: https://reviews.llvm.org/D86000
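The condition such a check looks for, bits being shifted out of an unsigned value, can be modeled with arbitrary-precision integers. This is a sketch of the condition only, not of the instrumentation; the function name and the 32-bit default width are assumptions.

```python
def unsigned_shl_loses_bits(value, shift, bits=32):
    """Return True if value << shift shifts set bits out of a bits-wide unsigned."""
    if not 0 <= value < (1 << bits):
        raise ValueError("value does not fit the unsigned type")
    if not 0 <= shift < bits:
        raise ValueError("shift exponent out of range")
    # Python integers do not wrap, so anything above the top bit is what a
    # fixed-width unsigned shift would have discarded.
    return (value << shift) >> bits != 0
```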
See RFC for background:
http://lists.llvm.org/pipermail/llvm-dev/2020-June/142744.html
Note that the runtime changes will be sent separately (hopefully this
week, need to add some tests).
This patch includes the LLVM pass to instrument memory accesses with
either inline sequences to increment the access count in the shadow
location, or alternatively to call into the runtime. It also changes
calls to memset/memcpy/memmove to the equivalent runtime version.
The pass is modeled on the address sanitizer pass.
The clang changes add the driver option to invoke the new pass, and to
link with the upcoming heap profiling runtime libraries.
Currently there is no attempt to optimize the instrumentation, e.g. to
aggregate updates to the same memory allocation. That will be
implemented as follow on work.
Differential Revision: https://reviews.llvm.org/D85948
This patch defaults to -mtune=generic unless -march is present. If -march is present we'll use the empty string unless it's overridden by -mtune. The backend should use the target CPU if the tune CPU isn't present.
It also adds AST serialization support to fix some tests that emit AST and parse it back. These tests diff the IR against the output from not going through AST. So if we don't serialize the tune CPU we fail the diff.
Differential Revision: https://reviews.llvm.org/D86488
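The defaulting rule in the first paragraph above reduces to three cases, sketched here. The function name is hypothetical, and `None` stands for an option that was not given on the command line.

```python
def default_tune_cpu(march=None, mtune=None):
    """Pick the tune CPU string passed on to the backend."""
    if mtune is not None:
        # An explicit -mtune always wins.
        return mtune
    if march is not None:
        # Empty string: the backend falls back to the target CPU.
        return ""
    return "generic"
```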
The non-standard header file `<sysexits.h>` provides some return values.
`EX_IOERR` is used as a special value to signal a broken pipe to the clang driver.
On z/OS Unix System Services, this header file does not exist. This patch
- adds a check for `<sysexits.h>`, removing the dependency on `LLVM_ON_UNIX`
- adds a new header file `llvm/Support/ExitCodes`, which either includes
`<sysexits.h>` or defines `EX_IOERR`
- updates the users of `EX_IOERR` to include the new header file
Reviewed By: hubert.reinterpretcast
Differential Revision: https://reviews.llvm.org/D83472
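The same "use the system definition if present, otherwise define it" pattern can be shown in a few lines. This is an analogy in Python rather than the header itself: `os.EX_IOERR` is only available on some platforms, 74 is the value `<sysexits.h>` traditionally assigns, and the helper name is hypothetical.

```python
import os

# Prefer the platform's definition; fall back to the traditional
# <sysexits.h> value when the platform does not provide one.
EX_IOERR = getattr(os, "EX_IOERR", 74)


def signals_broken_pipe(exit_code):
    """Return True if a child's exit code is the special broken-pipe marker."""
    return exit_code == EX_IOERR
```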
Add an option to directly specify where the msvc toolchain lives for
clang-cl and avoid unwanted file and registry probes.
Differential revision: https://reviews.llvm.org/D85998
If not overridden, AddClangSystemIncludeArgs's implementation is empty, so by
default, no system include args are added to the Clang driver. This means that
invoking Clang without the frontend must include a manual -I/usr/include flag,
which is inconsistent behavior. Therefore, override and implement this method
to match. Some boilerplate is also borrowed for handling of the other driver
flags.
While we are here, also override and enable HasNativeLLVMSupport.
Patch by: 3405691582 (dana koch)
Differential Revision: https://reviews.llvm.org/D86412
Some code bases out there pass -mtune=generic to clang. This would have
been ignored prior to D85384. Now it results in an error
because "generic" isn't recognized by isValidCPUName.
And if we let it go through to the backend as a tune
setting it would get the tune flags closer to i386 rather
than a modern CPU.
I plan to change what tune=generic does in the backend in
a future patch. And allow this in the frontend.
But this should be a quick fix for the error some users
are seeing.
mtune was previously ignored by the compiler so I'm not sure this
did anything. But after D85384 we're starting to support mtune
and this code is now causing a couple test failures on MacOS.
Building on the backend support from D85165. This parses the command line option in the driver, passes it on to CC1 and adds a function attribute.
- Still need to support tune on the target attribute.
- Need to use "generic" as the tuning by default. But need to change generic in the backend first.
- Need to set tune if march is specified and mtune isn't.
- May need to disable getHostCPUName's ability to guess CPU name from features when it doesn't have a family/model match for mtune=native. That's what gcc appears to do.
Differential Revision: https://reviews.llvm.org/D85384
`clang` currently requires the native linker on Solaris:
- It passes `-C` to `ld` which GNU `ld` doesn't understand.
- To use `gld`, one needs to pass the correct `-m EMU` option to select
the right emulation. Solaris `ld` cannot handle that option.
So far I've worked around this by passing `-DCLANG_DEFAULT_LINKER=/usr/bin/ld`
to `cmake`. However, if someone forgets this, it depends on the user's
`PATH` whether or not `clang` finds the correct linker, which doesn't make
for a good user experience.
While it would be nice to detect the linker flavor at runtime, this is more
involved. Instead, this patch defaults to `/usr/bin/ld` on Solaris. This
doesn't work on its own, however: a link fails with
clang-12: error: unable to execute command: Executable "x86_64-pc-solaris2.11-/usr/bin/ld" doesn't exist!
I avoid this by leaving absolute paths alone in `ToolChain::GetLinkerPath`.
Tested on `amd64-pc-solaris2.11`, `sparcv9-sun-solaris2.11`, and
`x86_64-pc-linux-gnu`.
Differential Revision: https://reviews.llvm.org/D84029
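The idea of leaving absolute linker paths alone can be sketched as follows. The triple-prefixing branch is a simplification inferred from the error message quoted above, not a description of how `ToolChain::GetLinkerPath` is actually structured, and the function name is hypothetical.

```python
import posixpath


def linker_program_name(target_triple, linker):
    """Return the linker name to look up, leaving absolute paths untouched."""
    if posixpath.isabs(linker):
        # An absolute path such as /usr/bin/ld is used exactly as given.
        return linker
    # Simplified assumption: relative names get a target-triple prefix.
    return f"{target_triple}-{linker}"
```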
Fixes pr/11710.
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Resubmit after breaking Windows and OSX builds.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D80242
Date: Mon Aug 10 10:31:50 2020 +0300
[AIX][Clang][Driver] Generate reference to the C++ library on the link step
Have the linker find libc++ on its search path by adding -lc++.
Reviewed by: daltenty, hubert.reinterpretcast, stevewan
Differential Revision: https://reviews.llvm.org/D85315
We had both a const char * to StringRef conversion and a const char *
to std::string conversion. Each of these does its own
strlen call if the compiler doesn't figure out how to share them.
By adding the temporary StringRef we can convert it to std::string
instead.
The other case is to use a StringSwitch<StringRef> instead of
StringSwitch<const char *> since the output values of the switch
are string literals. This allows the length to be computed at
compile time. Otherwise we have to convert from const char *
to std::string after the StringSwitch.
I believe this function used to be called directly from X86
specific code and was used to immediately create -target-cpu
command line. A later refactoring changed it to be called from
a generic getCPU function that returns std::string. So on some
paths we created a string using MakeArgString, converted that to
std::string, then called MakeArgString again from that.
Instead just return std::string directly like the other targets.
Instead of accepting the same arguments as regular linker,
the static linker will only accept input files.
Reviewed By: yaxunl
Differential Revision: https://reviews.llvm.org/D85442
`ninja check-all` currently fails on Illumos:
[84/716] Generating default/Asan-i386-inline-Test
FAILED: projects/compiler-rt/lib/asan/tests/default/Asan-i386-inline-Test
cd /var/llvm/dist-amd64-release/projects/compiler-rt/lib/asan/tests && /var/llvm/dist-amd64-release/./bin/clang ASAN_INST_TEST_OBJECTS.gtest-all.cc.i386-inline.o ASAN_INST_TEST_OBJECTS.asan_globals_test.cpp.i386-inline.o ASAN_INST_TEST_OBJECTS.asan_interface_test.cpp.i386-inline.o ASAN_INST_TEST_OBJECTS.asan_internal_interface_test.cpp.i386-inline.o ASAN_INST_TEST_OBJECTS.asan_test.cpp.i386-inline.o ASAN_INST_TEST_OBJECTS.asan_oob_test.cpp.i386-inline.o ASAN_INST_TEST_OBJECTS.asan_mem_test.cpp.i386-inline.o ASAN_INST_TEST_OBJECTS.asan_str_test.cpp.i386-inline.o ASAN_INST_TEST_OBJECTS.asan_test_main.cpp.i386-inline.o -o /var/llvm/dist-amd64-release/projects/compiler-rt/lib/asan/tests/default/./Asan-i386-inline-Test -g --driver-mode=g++ -fsanitize=address -m32
ld: fatal: unrecognized option '--no-as-needed'
ld: fatal: use the -z help option for usage information
clang-11: error: linker command failed with exit code 1 (use -v to see invocation)
`clang` unconditionally passes `--as-needed`/`--no-as-needed` to the
linker. This works on Solaris 11.[34] which added a couple of option
aliases to the native linker to improve compatibility with GNU `ld`.
Illumos `ld` didn't do this, so one needs to use the corresponding
native options `-z ignore`/`-z record` instead.
Because this works on both Solaris and Illumos, the current patch always
passes the native options on Solaris. This isn't fully correct, however:
when using GNU `ld` on Solaris (not yet supported; I'm working on that),
one still needs `--as-needed` instead.
I'm hardcoding this decision because a generic detection via a `cmake` test
is hard: many systems have their own implementation of `getDefaultLinker`
and `cmake` would have to duplicate the information encoded there.
Besides, it would still break when `-fuse-ld` is used.
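The option choice described above can be sketched as follows, in Python for brevity. The function name `as_needed_flags` and its `target_os` parameter are hypothetical and do not correspond to the actual driver code; only the option spellings are taken from the text.

```python
def as_needed_flags(target_os):
    """Return the (enable, disable) linker option pair for "as needed" linking.

    Assumption: target_os is a plain string such as "solaris" or "linux";
    the real driver inspects the target triple instead.
    """
    if target_os == "solaris":
        # Native Solaris/Illumos ld spelling, accepted by both linkers.
        return ("-z ignore", "-z record")
    # GNU ld spelling, used everywhere else.
    return ("--as-needed", "--no-as-needed")
```

As the text notes, hardcoding by OS does not cover GNU `ld` on Solaris or `-fuse-ld`; a fuller version would key on the linker actually selected.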
Tested on `amd64-pc-solaris2.11` (Solaris 11.4 and OpenIndiana 2020.04),
`sparcv9-sun-solaris2.11`, and `x86_64-pc-linux-gnu`.
Differential Revision: https://reviews.llvm.org/D84412
Once available in the relevant toolchains this will allow us to implement
LLVM_EXTERNALIZE_DEBUGINFO_OUTPUT_DIR after D84127 by directly placing the dSYM
in the desired location instead of emitting next to the output file and moving
it.
Reviewed By: JDevlieghere
Differential Revision: https://reviews.llvm.org/D84572
for device simulators
This change separates out the iOS/tvOS/watchOS simulator slices from the "libclang_rt.<os>.a"
fat archive, by moving them out to their own "libclang_rt.<os>sim.a" static archive.
This allows us to build and to link with an arm64 device simulator slice for the simulators running
on Apple Silicon, and to distribute it in one archive alongside the Intel simulator slices.
Differential Revision: https://reviews.llvm.org/D84564
A list of target features is disabled when there is no hardware
floating-point support. This is the case when one of the following
options is passed to clang:
- -mfloat-abi=soft
- -mfpu=none
However, this option list is missing the extension "+nofp", which can be
specified in -march flags, such as "-march=armv8-a+nofp".
This patch also disables unsupported target features when nofp is passed
to -march.
Differential Revision: https://reviews.llvm.org/D82948
Summary: This patch disables implicit builtin knowledge about memcmp-like functions when compiling the program for fuzzing, i.e., when -fsanitize=fuzzer(-no-link) is given. This allows libFuzzer to always intercept memcmp-like functions as it effectively disables optimizing calls to such functions into different forms. This is done by adding a set of flags (-fno-builtin-memcmp and others) in the clang driver. Individual -fno-builtin-* flags previously used in several libFuzzer tests are now removed, as it is now done automatically in the clang driver.
The patch was once reverted in 8ef9e2bf35, as this patch was dependent on a reverted commit f78d9fceea. This reverted commit was recommitted in 831ae45e3d, so relanding this dependent patch too.
Reviewers: morehouse, hctim
Subscribers: cfe-commits, #sanitizers
Tags: #clang, #sanitizers
Differential Revision: https://reviews.llvm.org/D83987
Many driver options are neither 'DriverOption' nor 'LinkerInput'. When gcc is
used for linking, these options get forwarded even if they don't have anything
to do with linking. Among these options, clang-specific ones can cause gcc to
error.
Just use 'OPT_Link_Group' and a new flag 'LinkOption' for options which already
have a group.
gfortran support apparently bit rots (which does not seem to make much sense). XFAIL the test.
SUMMARY:
Since we add the .extern directive for external symbols, the -u option for the AIX assembler (as) is no longer needed.
Reviewers: Jason liu
Differential Revision: https://reviews.llvm.org/D84356
Summary: libFuzzer intercepts certain library functions such as memcmp/strcmp by defining weak hooks. Weak hooks, however, are called only when other runtimes such as ASan are linked. This patch defines libFuzzer's own interceptors, which are linked into the libFuzzer executable when other runtimes are not linked, i.e., when -fsanitize=fuzzer is given, but not others.
The patch once landed but was reverted in 8ef9e2bf35 due to an assertion failure caused by calling an intercepted function, strncmp, while initializing the interceptors in fuzzerInit(). This issue is now fixed by calling libFuzzer's own implementation of library functions (i.e., internal_*) when the fuzzer has not been initialized yet, instead of recursively calling fuzzerInit() again.
Reviewers: kcc, morehouse, hctim
Subscribers: #sanitizers, krytarowski, mgorny, cfe-commits
Tags: #clang, #sanitizers
Differential Revision: https://reviews.llvm.org/D83494
Using -fmodules-* options for PCHs is a bit confusing, so add -fpch-*
variants. Having extra options also makes it simple to do a configure
check for the feature.
Also document the options in the release notes.
Differential Revision: https://reviews.llvm.org/D83623
This should work the same way as with a.pcm for modules.
An alternative way is 'clang++ -c empty.cpp -include-pch a.pch -o a.o
-Xclang -building-pch-with-obj', which is what clang-cl's /Yc does
internally.
Differential Revision: https://reviews.llvm.org/D83716
Supersedes D80225. Add --ld-path= to avoid strange target specific
prefixes and make -fuse-ld= focus on its intended job: "linker flavor".
(-f* affects generated code or language features. --ld-path does not
affect codegen, so it is not named -f*)
The way --ld-path= works is similar to "Command Search and Execution" in POSIX.1-2017 2.9.1 Simple Commands.
If --ld-path= specifies
* an absolute path, the value specifies the linker.
* a relative path without a path component separator (/), the value is searched for using -B, COMPILER_PATH, then PATH.
* a relative path with a path component separator, the linker is found relative to the current working directory.
-fuse-ld= and --ld-path= can be composed, e.g. `-fuse-ld=lld --ld-path=/usr/bin/ld.lld`
The driver can base its linker option decision on the flavor -fuse-ld=, but it should not do fragile
flavor checking with --ld-path=.
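The three resolution rules above can be sketched as follows. This is only an illustration in Python, not the driver's implementation: `resolve_ld_path`, `search_dirs`, `cwd`, and `is_executable` are hypothetical names, and `search_dirs` is assumed to be the already-ordered list of -B prefixes, COMPILER_PATH entries, and PATH entries.

```python
import os
import posixpath


def _default_is_executable(path):
    # Assumption: "found" means a regular file with execute permission.
    return os.path.isfile(path) and os.access(path, os.X_OK)


def resolve_ld_path(value, search_dirs, cwd, is_executable=_default_is_executable):
    """Resolve an --ld-path= value to a linker path, or None if not found."""
    if posixpath.isabs(value):
        # Rule 1: an absolute path names the linker directly.
        return value
    if "/" not in value:
        # Rule 2: a bare name is searched in -B, COMPILER_PATH, then PATH.
        for directory in search_dirs:
            candidate = posixpath.join(directory, value)
            if is_executable(candidate):
                return candidate
        return None
    # Rule 3: a relative path with a separator is relative to the cwd.
    return posixpath.normpath(posixpath.join(cwd, value))
```

For example, `resolve_ld_path("bin/ld.lld", [], "/work")` yields `/work/bin/ld.lld` without consulting any search directory.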
Reviewed By: whitequark, keith
Differential Revision: https://reviews.llvm.org/D83015
No real action is taken for a value of scalable but it provides a
route to disable an earlier specification and is effectively its
default value when omitted.
Patch also removes an "unused variable" warning.
Differential Revision: https://reviews.llvm.org/D84021
To match GCC (whether cross-compiling or not), which doesn't prepend target triple prefixes to `exec_prefixes`.
As an example, powerpc64le-linux-gnu-gcc does not search "powerpc64le-linux-gnu-${name}" in a -B path.
GCC r187297 (2012-05) introduced `__gcov_dump` and `__gcov_reset`.
`__gcov_flush = __gcov_dump + __gcov_reset`
The resolution to https://gcc.gnu.org/PR93623 ("No need to dump gcdas when forking" target GCC 11.0) removed the unuseful and undocumented __gcov_flush.
Close PR38064.
Reviewed By: calixte, serge-sans-paille
Differential Revision: https://reviews.llvm.org/D83149
Summary:
This patch implements parsing support for the 'arm_sve_vector_bits' type
attribute, defined by the Arm C Language Extensions (ACLE, version 00bet5,
section 3.7.3) for SVE [1].
The purpose of this attribute is to define fixed-length (VLST) versions
of existing sizeless types (VLAT). For example:
#if __ARM_FEATURE_SVE_BITS==512
typedef svint32_t fixed_svint32_t __attribute__((arm_sve_vector_bits(512)));
#endif
Creates a type 'fixed_svint32_t' that is a fixed-length version of
'svint32_t' that is normal-sized (rather than sizeless) and contains
exactly 512 bits. Unlike 'svint32_t', this type can be used in places
such as structs and arrays where sizeless types can't.
Implemented in this patch is the following:
* Defined and tested attribute taking single argument.
* Checks the argument is an integer constant expression.
* Attribute can only be attached to a single SVE vector or predicate
type, excluding tuple types such as svint32x4_t.
* Added the `-msve-vector-bits=<bits>` flag. When specified the
`__ARM_FEATURE_SVE_BITS__EXPERIMENTAL` macro is defined.
* Added a language option to store the vector size specified by the
`-msve-vector-bits=<bits>` flag. This is used to validate `N ==
__ARM_FEATURE_SVE_BITS`, where N is the number of bits passed to the
attribute and `__ARM_FEATURE_SVE_BITS` is the feature macro defined under
the same flag.
The `__ARM_FEATURE_SVE_BITS` macro will be made non-experimental in the final
patch of the series.
[1] https://developer.arm.com/documentation/100987/latest
This is patch 1/4 of a patch series.
Reviewers: sdesmalen, rsandifo-arm, efriedma, ctetreau, cameron.mcinally, rengolin, aaron.ballman
Reviewed By: sdesmalen, aaron.ballman
Differential Revision: https://reviews.llvm.org/D83550
This causes binaries linked with this runtime to crash on startup if
dlsym uses any of the intercepted functions. (For example, that happens
when using tcmalloc as the allocator: dlsym attempts to allocate memory
with malloc, and tcmalloc uses strncmp within its implementation.)
Also revert dependent commit "[libFuzzer] Disable implicit builtin knowledge about memcmp-like functions when -fsanitize=fuzzer-no-link is given."
This reverts commit f78d9fceea and 12d1124c49.
Summary: This patch disables implicit builtin knowledge about memcmp-like functions when compiling the program for fuzzing, i.e., when -fsanitize=fuzzer(-no-link) is given. This allows libFuzzer to always intercept memcmp-like functions as it effectively disables optimizing calls to such functions into different forms. This is done by adding a set of flags (-fno-builtin-memcmp and others) in the clang driver. Individual -fno-builtin-* flags previously used in several libFuzzer tests are now removed, as it is now done automatically in the clang driver.
Reviewers: morehouse, hctim
Subscribers: cfe-commits, #sanitizers
Tags: #clang, #sanitizers
Differential Revision: https://reviews.llvm.org/D83987
Summary: libFuzzer intercepts certain library functions such as memcmp/strcmp by defining weak hooks. Weak hooks, however, are called only when other runtimes such as ASan are linked. This patch defines libFuzzer's own interceptors, which are linked into the libFuzzer executable when other runtimes are not linked, i.e., when -fsanitize=fuzzer is given, but not others.
Reviewers: kcc, morehouse, hctim
Reviewed By: morehouse, hctim
Subscribers: krytarowski, mgorny, cfe-commits, #sanitizers
Tags: #clang, #sanitizers
Differential Revision: https://reviews.llvm.org/D83494
Summary:
1. gcc uses the `-march` and `-mtune` flags to choose the arch and
pipeline model, but clang does not have a `-mtune` flag, so
we use `-mcpu` to choose both.
2. Add SiFive e31 and u54 CPUs, which have a default march
and pipeline model.
3. Specifying `-mcpu` with rocket-rv[32|64] selects the
pipeline model only, and uses the driver's arch-selection
logic to get the default arch.
Reviewers: lenary, asb, evandro, HsiangKai
Reviewed By: lenary, asb, evandro
Tags: #llvm, #clang
Differential Revision: https://reviews.llvm.org/D71124
Currently if two multi-letter extensions are provided in a -march=
string, the verification code checks the version of the first and
consumes the second, resulting in that part of the architecture
string being ignored. This adds a test that when a version number has
been parsed for an extension, there are no subsequent characters.
Differential Revision: https://reviews.llvm.org/D83819
Check that the implicit cast from `id` used to construct the element
variable in an ObjC for-in statement is valid.
This check is included as part of a new `objc-cast` sanitizer, outside
of the main 'undefined' group, as (IIUC) the behavior it's checking for
is not technically UB.
The check can be extended to cover other kinds of invalid casts in ObjC.
Partially addresses: rdar://12903059, rdar://9542496
Differential Revision: https://reviews.llvm.org/D71491
Summary:
Similar to what we have done downstream, some time ago:
https://svnweb.freebsd.org/changeset/base/353936
This followed some discussions on the freebsd-arch mailing lists, and
most people agreed that it was a better default, and also it worked
around several issues where clang generated libcalls to 64 bit atomic
primitives, instead of using cmpxchg8b.
Reviewers: emaste, brooks, rsmith
Reviewed By: emaste
Subscribers: arichardson, krytarowski, jfb, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D83645
Do not detect device library by default in rocm detector.
Only detect device library in Rocm and HIP toolchain.
Separate detection of HIP runtime and Rocm device library.
Detect rocm path by version file in host toolchains.
Also added detecting rocm version and printing rocm
installation path and version with -v.
Fixed include path and device library detection for
ROCm 3.5.
Added --hip-version option. Renamed --hip-device-lib-path
to --rocm-device-lib-path.
Fixed default value for -fhip-new-launch-api.
Added default -std option for HIP.
Differential Revision: https://reviews.llvm.org/D82930
Summary:
-debug-info-kind=constructor reduces the amount of class debug info that
is emitted; this patch switches to using this as the default.
Constructor homing emits the complete type info for a class only when the
constructor is emitted, so it is expected that there will be some classes that
are not defined in the debug info anymore because they are never constructed,
and we shouldn't need debug info for these classes.
I compared the PDB files for clang, and there are 273 class types that are defined with `=limited`
but not with `=constructor` (out of ~60,000 total class types).
We've looked at a number of the types that are no longer defined with =constructor. The vast
majority of cases are something like class A is used as a parameter in a member function of
some other class B, which is emitted. But the function that uses class A is never called, and class A
is never constructed, and therefore isn't emitted in the debug info.
Bug: https://bugs.llvm.org/show_bug.cgi?id=46537
Subscribers: aprantl, cfe-commits, lldb-commits
Tags: #clang, #lldb
Differential Revision: https://reviews.llvm.org/D79147
The other backends don't know what this feature is and print a
message to stderr.
I recently tried to rework some target feature stuff in X86 and
this unknown feature tripped an assert I added.
Differential Revision: https://reviews.llvm.org/D83369
This patch creates a clang flag to enable SESES. This flag also ensures that
lvi-cfi is on when using seses via clang.
SESES should use lvi-cfi to mitigate returns and indirect branches.
The flag to enable the SESES functionality only without lvi-cfi is now
-x86-seses-enable-without-lvi-cfi to warn users part of the mitigation is not
enabled if they use this flag. This is useful in case folks want to see the
cost of SESES separate from the LVI-CFI.
Reviewed By: sconstab
Differential Revision: https://reviews.llvm.org/D79910
The Ubuntu system ld does not recognize the amdgcn-amd-amdhsa target.
Therefore the host object with the embedded device fat binary should not be
assembled with that triple. It should use the default triple, so that the
object is compatible with the system ld.
Reviewed By: yaxunl
Differential Revision: https://reviews.llvm.org/D83145
Making -g[no-]column-info opt out reduces the length of a typical CC1 command line.
Additionally, in a non-debug compile, we won't see -dwarf-column-info.
Summary:
Rename VE.cpp and VE.h to VEToolchain.cpp and VEToolchain.h respectively
in order to avoid link warning message. Linker warns that VE.cpp.o and
Arch/VE.cpp.o have the same name.
Reviewers: simoll, k-ishizaka
Reviewed By: simoll
Subscribers: mgorny, cfe-commits
Tags: #llvm, #ve, #clang
Differential Revision: https://reviews.llvm.org/D82968
Summary:
If you execute the following commandline multiple times, the behavior was not always the same:
clang++ --target=thumbv7em-none-windows-eabi-coff -march=armv7-m -mcpu=cortex-m7 -o temp.obj -c -x c++ empty.cpp
Most of the time the compilation succeeded, but sometimes clang reported this error:
clang++: error: the target architecture 'thumbv7em' is not supported by the target 'thumbv7em-none-windows-eabi'
The cause of the inconsistent behavior was the uninitialized variable Version.
With these commandline arguments, the variable Version was not set by getAsInteger(),
because it cannot parse a number from the substring "7em" (of "thumbv7em").
To get a consistent behaviour, it's enough to initialize the variable Version to zero.
Zero is smaller than 7, so the comparison will be true.
Then the command always fails with the error message seen above.
By using consumeInteger() instead of getAsInteger() we get 7 from the substring "7em"
and the command does not fail.
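The difference between the two parsing behaviours can be modelled as below. This is a Python sketch of the semantics only; `get_as_integer` and `consume_integer` are hypothetical stand-ins for the LLVM `StringRef` methods, and they return `None` on failure rather than using an out-parameter.

```python
def get_as_integer(text):
    """Succeed only if the whole string is a decimal number."""
    if text and text.isascii() and text.isdigit():
        return int(text)
    return None  # On failure the caller's variable is left untouched.


def consume_integer(text):
    """Parse the leading decimal digits and return (value, remainder)."""
    end = 0
    while end < len(text) and text[end] in "0123456789":
        end += 1
    if end == 0:
        return None, text
    return int(text[:end]), text[end:]
```

With the substring from the failing command line, `get_as_integer("7em")` fails, while `consume_integer("7em")` produces 7 and leaves "em" behind.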
Reviewers: compnerd, danielkiss
Reviewed By: danielkiss
Subscribers: danielkiss, kristof.beyls, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D75453
if it's newer than the target version
This change ensures that the arm64-apple-macOS slice is linked for
macOS 11 even if the deployment target is earlier than macOS 11.
specified at Command creation, rather than as part of the Tool.
This resolves the hack I just added to allow Darwin toolchain to vary
its level of support based on `-mlinker-version=`.
The change preserves the _current_ settings for response-file support.
Some tools look likely to be declaring that they don't support
response files in error, however I kept them as-is in order for this
change to be a simple refactoring.
Differential Revision: https://reviews.llvm.org/D82782
In XCode 12, ld64 got support for @files, in addition to the old
-filelist mechanism. Response files allow passing all command-line
arguments to the linker via a file, rather than just filenames, and is
therefore preferred.
Because of the way response-file support is currently implemented as
part of the Tool class in Clang, this change requires an ugly backdoor
function to access Args. A follow-up commit fixes this, but I've
ordered this change first, for easier backportability.
I've added no tests here, because unfortunately, there don't appear to
be _any_ response-file emission automated tests, and I don't see an
obvious way to add them. I've tested that this change works as
expected locally.
Differential Revision: https://reviews.llvm.org/D82777
This change ensures that the Darwin driver doesn't add unsupported libraries to the link
invocation when linking the Apple Silicon macOS slice.
rdar://61011136
Differential Revision: https://reviews.llvm.org/D82696
This fixes a unit test. Otherwise here is the original commit:
1) Shared writable directories like /tmp are a security problem.
2) Systems provide dedicated cache directories these days anyway.
3) This also refines LLVM's cache_directory() on Darwin platforms to use
the Darwin per-user cache directory.
Reviewers: compnerd, aprantl, jakehehrlich, espindola, respindola, ilya-biryukov, pcc, sammccall
Reviewed By: compnerd, sammccall
Subscribers: hiraditya, llvm-commits, cfe-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D82362
1) Shared writable directories like /tmp are a security problem.
2) Systems provide dedicated cache directories these days anyway.
3) This also refines LLVM's cache_directory() on Darwin platforms to use
the Darwin per-user cache directory.
Reviewers: compnerd, aprantl, jakehehrlich, espindola, respindola, ilya-biryukov, pcc, sammccall
Reviewed By: compnerd, sammccall
Subscribers: hiraditya, llvm-commits, cfe-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D82362
Summary:
Added support for dynamic memory allocation for globalized variables
when parallel execution of target regions is required.
Reviewers: jdoerfert
Subscribers: jholewinski, yaxunl, guansong, sstefan1, cfe-commits, caomhin
Tags: #clang
Differential Revision: https://reviews.llvm.org/D82324
This reverts commit f570d58104.
The test was failing on MacOS if you set
LLVM_DEFAULT_TARGET_TRIPLE. For example if you set it to
"x86_64-apple-darwin" clang actually uses
"x86_64-apple-darwin<version>".
To fix this get default triple from clang itself during the
test instead of substituting it in via lit.
is running on an Apple Silicon mac
This change allows users to use `-arch arm64` to build for macOS when
running on an Apple Silicon Mac without an explicit `-target` option.
Differential Revision: https://reviews.llvm.org/D82428
This reverts commit ede6005e70.
Ayke suggests this value varies chip-by-chip, and thus it is not safe to
hardcode to 0x800100.
Proper logic for this linker parameter will have to be wired up in a
follow up patch.
Add GNU Static Lib Tool, which supports the --emit-static-lib
flag. For HIP, a static library archive will be created,
consisting of the HIP fat binary host object with the device images embedded.
llvm-ar is used to create the static archive. Also, delete any existing
output file to ensure a new archive is created each time.
Reviewers: yaxunl, tra, rjmccall, echristo
Subscribers: echristo, JonChesterfield, scchan, msearles
Differential Revision: https://reviews.llvm.org/D78759
This patch is a follow up on https://reviews.llvm.org/D78759.
Extract the HIP Linker script from generic GNU linker,
and move it into HIP ToolChain. Update OffloadActionBuilder
Link actions feature to apply device linking and host linking
actions separately. Using MC Directives, embed the device images
and define symbols.
Reviewers: JonChesterfield, yaxunl
Subscribers: tra, echristo, jdoerfert, msearles, scchan
Differential Revision: https://reviews.llvm.org/D81963
Summary:
As seen in:
https://bugs.llvm.org/show_bug.cgi?id=45693
When clang looks for a tool it has a set of
possible names for it, in priority order.
Previously it would look for these names in
the program path. Then look for all the names
in the PATH.
This means that aarch64-none-elf-gcc on the PATH
would lose to gcc in the program path.
(which was /usr/bin in the bug's case)
This changes that logic to search each name in both
possible locations, then move to the next name.
Which is more what you would expect to happen when
using a non default triple.
(-B prefixes maybe should follow this logic too,
but are not changed in this patch)
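The change in search order can be illustrated with a small model. This is a sketch in Python, not the driver code: `find_tool_old`, `find_tool_new`, and the `exists` predicate are hypothetical, and `/opt/cross/bin` in the usage note is a made-up directory.

```python
def find_tool_old(names, program_dirs, path_dirs, exists):
    """Previous order: try every name in the program path, then every name in PATH."""
    for dirs in (program_dirs, path_dirs):
        for name in names:
            for directory in dirs:
                candidate = directory + "/" + name
                if exists(candidate):
                    return candidate
    return None


def find_tool_new(names, program_dirs, path_dirs, exists):
    """New order: try each name in both locations before moving to the next name."""
    for name in names:
        for directory in list(program_dirs) + list(path_dirs):
            candidate = directory + "/" + name
            if exists(candidate):
                return candidate
    return None
```

With names `["aarch64-none-elf-gcc", "gcc"]`, a program path of `/usr/bin` containing only `gcc`, and a PATH entry `/opt/cross/bin` containing the prefixed compiler, the old order picks `/usr/bin/gcc` while the new order picks `/opt/cross/bin/aarch64-none-elf-gcc`.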
Subscribers: kristof.beyls, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D79988
Add -fpch-instantiate-templates which makes template instantiations be
performed already in the PCH instead of it being done in every single
file that uses the PCH (but every single file will still do it as well
in order to handle its own instantiations). I can see 20-30% build
time saved with the few tests I've tried.
The change may reorder compiler output and also generated code, but
should be generally safe and produce functionally identical code.
There are some rare cases that do not compile with it,
such as test/PCH/pch-instantiate-templates-forward-decl.cpp. If
template instantiation bailed out instead of reporting the error,
these instantiations could even be postponed, which would make them
work.
Enable this by default for clang-cl. MSVC creates PCHs by compiling
them using an empty .cpp file, which means templates are instantiated
while building the PCH and so the .h needs to be self-contained,
causing test/PCH/pch-instantiate-templates-forward-decl.cpp to fail
with MSVC anyway. So the option being enabled for clang-cl matches this.
Differential Revision: https://reviews.llvm.org/D69585
Keep deprecated -fsanitize-coverage-{white,black}list as aliases for compatibility for now.
Reviewed By: echristo
Differential Revision: https://reviews.llvm.org/D82244
On AIX, we use __atexit to register dtor functions rather than __cxa_atexit.
So a driver change is needed to default AIX to using -fno-use-cxa-atexit.
The Windows platform does not use __cxa_atexit either. Following its precedent,
we remove the assertion for when -fuse-cxa-atexit is specified by the user,
do not produce a message and silently default to -fno-use-cxa-atexit behavior.
Differential Revision: https://reviews.llvm.org/D82136
The accepted options to -mharden-sls= are:
* all: enable all mitigations against Straight Line Speculation that are
implemented.
* none: disable all mitigations against Straight Line Speculation.
* retbr: enable the mitigation against Straight Line Speculation for RET
and BR instructions.
* blr: enable the mitigation against Straight Line Speculation for BLR
instructions.
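As a rough illustration of the option values listed above, the mapping from a value to the enabled mitigations might look like the following Python sketch. `parse_harden_sls` is a hypothetical helper, and it assumes that `all` currently means exactly the two mitigations named in the list.

```python
# Assumption: "all" expands to exactly the two mitigations listed in the text.
_MITIGATIONS = {
    "all": frozenset({"retbr", "blr"}),
    "none": frozenset(),
    "retbr": frozenset({"retbr"}),
    "blr": frozenset({"blr"}),
}


def parse_harden_sls(value):
    """Return the set of mitigations enabled by -mharden-sls=<value>."""
    try:
        return _MITIGATIONS[value]
    except KeyError:
        raise ValueError(f"invalid -mharden-sls= value: {value!r}") from None
```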
Differential Revision: https://reviews.llvm.org/D81404
Reviewers: dylanmckay
Reviewed By: dylanmckay
Subscribers: Jim, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77334
This was originally committed in
03b0831144 but I missed the commit
attribution.
Patch by Dennis van der Schagt.
Enable -amdgpu-internalize-symbols to eliminate unused functions and global variables
for whole program to speed up compilation and improve performance.
For -fno-gpu-rdc, -amdgpu-internalize-symbols is passed to clang -cc1.
For -fgpu-rdc, -amdgpu-internalize-symbols is passed to lld.
Differential Revision: https://reviews.llvm.org/D81959
Currently the ROCm detector expects device library bitcode files named *.bc
instead of *.amdgcn.bc. However, in ROCm 3.5 the device library bitcode files
are named *.amdgcn.bc, which causes ROCm 3.5 not to be detected.
This patch fixes that.
Differential Revision: https://reviews.llvm.org/D81713
Summary:
The Android NDK's clang driver is used with an Android -target setting,
and the driver automatically finds the Android sysroot at a path
relative to the driver. The sysroot has the libc++ headers in it.
Remove Hurd::computeSysRoot as it is equivalent to the new
ToolChain::computeSysRoot method.
Fixes PR46213.
Reviewers: srhines, danalbert, #libc, kristina
Reviewed By: srhines, danalbert
Subscribers: ldionne, sthibaul, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D81622
Summary:
Add a flag to omit the xray_fn_idx to cut size overhead and relocations
roughly in half at the cost of reduced performance for single function
patching. Minor additions to compiler-rt support per-function patching
without the index.
Reviewers: dberris, MaskRay, johnislarry
Subscribers: hiraditya, arphaman, cfe-commits, #sanitizers, llvm-commits
Tags: #clang, #sanitizers, #llvm
Differential Revision: https://reviews.llvm.org/D81995
The msvcrt library isn't a pure import library; it does contain
regular object files with wrappers/fallbacks, and these can require
linking against kernel32.
This only makes a difference when linking with ld.bfd, as lld
always searches all static libraries.
This matches a similar change made recently in gcc in
https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=850533ab160ef40eccfd039e1e3b138cf26e76b8,
although clang adds --start-group --end-group around these libraries
if -static is specified, which gcc doesn't. But try to match gcc's
linking order in any case, for consistency.
Differential Revision: https://reviews.llvm.org/D80880
Summary:
We're trying to use the --config options to pass distro specific
options for Fedora via the CFLAGS variable. However, some projects
end up using the CFLAGS variable multiple times in their command line,
which leads to an error when --config is used.
This patch resolves this issue by allowing more than one --config option
on the command line as long as the file names are the same.
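The relaxed rule can be sketched as a small check, written here in Python. `check_config_options` is a hypothetical helper, not the driver's function, and the file name `fedora.cfg` in the usage note is made up for illustration.

```python
def check_config_options(config_files):
    """Return the single config file to load, or None if --config was not given.

    Repeated --config options are accepted only when they all name the same file.
    """
    if len(set(config_files)) > 1:
        raise ValueError("conflicting --config options: " + ", ".join(config_files))
    return config_files[0] if config_files else None
```

A command line where CFLAGS is expanded twice, giving `--config fedora.cfg --config fedora.cfg`, now resolves to `fedora.cfg` instead of producing an error.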
Reviewers: sepavloff, hfinkel
Reviewed By: sepavloff
Subscribers: cfe-commits, llvm-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D81424
This patch is a follow up on https://reviews.llvm.org/D81627.
In addition to the default -fno-gpu-rdc case, this patch lets the
HIP toolchain stop using llvm-link/opt/llc to link device code
for the -fgpu-rdc case. Instead, it uses standard LTO.
This will eliminate some redundant optimizations and speed
up the compilation/linking.
Differential Revision: https://reviews.llvm.org/D81861
Currently HIP toolchain calls clang to emit bitcode then calls opt/llc for device compilation for the default -fno-gpu-rdc
case, which is unnecessary since clang is able to compile a single source file to ISA.
This patch fixes the HIP action builder and toolchain so that the default -fno-gpu-rdc can be done like a canonical
toolchain, i.e. one clang -cc1 invocation to compile source code to ISA.
This can avoid unnecessary processes to speed up the compilation, and avoid redundant LLVM passes which are
performed in clang -cc1 and opt.
Differential Revision: https://reviews.llvm.org/D81627
It's useful for using clang from tools that may need to provide SDK files
from non-standard locations.
Clang CLI only provides a way to specify VFS for include files, so there's no
good way to test this yet.
Differential Revision: https://reviews.llvm.org/D81771
Summary:
- In HIP, just as the regular device-only compilation, the device-only
relocatable code compilation should not involve offload bundle.
- In addition, that device-only relocatable code compilation should have
the same 3 steps, namely preprocessor, compile, and backend, as
regular code generation with `-emit-llvm`.
Reviewers: yaxunl, tra
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D81427
Summary:
ROCm.h had been getting the declarations for various data structures
by being #included next to them, rather than #includeing them itself.
This change fixes that by explicitly including the appropriate headers.
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D81432
Summary:
Add -ftrivial-auto-var-init-stop-after= to limit the number of times
stack variables are initialized when -ftrivial-auto-var-init= is used to
initialize stack variables to zero or a pattern. This flag can be used
to bisect uninitialized uses of a stack variable exposed by automatic
variable initialization, such as http://crrev.com/c/2020401.
Reviewers: jfb, vitalybuka, kcc, glider, rsmith, rjmccall, pcc, eugenis, vlad.tsyrklevich
Reviewed By: jfb
Subscribers: phosek, hubert.reinterpretcast, srhines, MaskRay, george.burgess.iv, dexonsmith, inglorion, gbiv, llozano, manojgupta, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77168
Summary: This patch removes the special handling for Darwin on PowerPC in the default target cpu handling, because Darwin is no longer supported on the PowerPC platform.
Reviewers: hubert.reinterpretcast, daltenty
Reviewed By: hubert.reinterpretcast
Subscribers: wuzish, nemanjai, shchenz, steven.zhang, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D81115
To support std::complex and some other standard C/C++ functions in HIP device code,
they need to be forced to be __host__ __device__ functions by pragmas. This is done
by some clang standard C++ wrapper headers which are shared between cuda-clang and hip-Clang.
For these standard C++ wrapper headers to work properly, specific include path order
has to be enforced:
clang C++ wrapper include path
standard C++ include path
clang include path
Also, these C++ wrapper headers require that device versions of some standard C/C++ functions
be declared before including them. This needs to be done by including a default
header which declares or defines these device functions. The default header is always
included before any other headers are included by users.
This patch adds the default header and include path for HIP.
Differential Revision: https://reviews.llvm.org/D81176
Follow the model used on Linux, where the clang driver passes the
linker a -u switch to force the profile runtime to be linked in,
rather than having every TU emit a dead function with a reference.
Differential Revision: https://reviews.llvm.org/D79835
Follow the model used on Linux, where the clang driver passes the
linker a -u switch to force the profile runtime to be linked in,
rather than having every TU emit a dead function with a reference.
Patch By: mcgrathr
Differential Revision: https://reviews.llvm.org/D79835
Summary: This patch changes the AIX default target CPU to power4 since this is the lowest arch for the lowest OS level supported.
Reviewers: hubert.reinterpretcast, cebowleratibm, daltenty
Reviewed By: hubert.reinterpretcast
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D80835
Summary:
An upgrade of LLVM for CrOS [0] containing [1] triggered a bunch of
errors related to writing to reserved registers for a Linux kernel's
arm64 compat vdso (which is an aarch32 image).
After a discussion on LKML [2], it was determined that
-f{no-}omit-frame-pointer was not being specified. Comparing GCC and
Clang [3], it becomes apparent that GCC defaults to omitting the frame
pointer implicitly when optimizations are enabled, and Clang does not.
That is, in GCC, setting -O1 (or above) implies -fomit-frame-pointer. Clang was
defaulting to -fno-omit-frame-pointer implicitly unless -fomit-frame-pointer
was set explicitly.
Why this becomes a problem is that the Linux kernel's arm64 compat vdso
contains code that uses r7. r7 is used sometimes for the frame pointer
(for example, when targeting thumb (-mthumb)). See useR7AsFramePointer()
in llvm/llvm-project/llvm/lib/Target/ARM/ARMSubtarget.h. This is mostly
for legacy/compatibility reasons, and the 2019 Q4 revision of the ARM
AAPCS looks to standardize r11 as the frame pointer for aarch32, though
this is not yet implemented in LLVM.
Users who rely on the implicit default when optimizations are enabled
and neither flag is given should explicitly choose -fomit-frame-pointer
(new behavior) or -fno-omit-frame-pointer (old behavior).
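The two defaults compared above can be modelled as a small decision function. This is only a sketch of the behavior described in this message, not the driver's code; the function name and its parameters are hypothetical.

```python
from typing import Optional


def omits_frame_pointer(opt_level: int, explicit: Optional[bool],
                        gcc_compatible: bool = True) -> bool:
    """Return True if the frame pointer is omitted.

    explicit: True for -fomit-frame-pointer, False for
    -fno-omit-frame-pointer, None when neither flag is given.
    gcc_compatible: True models the new default, False models the
    old Clang default described above.
    """
    if explicit is not None:
        # An explicit flag always wins over the implicit default.
        return explicit
    if gcc_compatible:
        # GCC-style default: -O1 and above imply -fomit-frame-pointer.
        return opt_level >= 1
    # Old Clang default: keep the frame pointer unless told otherwise.
    return False
```

Code that needs a stable answer regardless of the default should pass one of the two flags, which corresponds to a non-None `explicit` here.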
[0] https://bugs.chromium.org/p/chromium/issues/detail?id=1084372
[1] https://reviews.llvm.org/D76848
[2] https://lore.kernel.org/lkml/20200526173117.155339-1-ndesaulniers@google.com/
[3] https://godbolt.org/z/0oY39t
Reviewers: kristof.beyls, psmith, danalbert, srhines, MaskRay, ostannard, efriedma
Reviewed By: psmith, danalbert, srhines, MaskRay, efriedma
Subscribers: efriedma, olista01, MaskRay, vhscampos, cfe-commits, llvm-commits, manojgupta, llozano, glider, hctim, eugenis, pcc, peter.smith, srhines
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D80828
This patch adds clang options:
-fbasic-block-sections={all,<filename>,labels,none} and
-funique-basic-block-section-names.
LLVM Support for basic block sections is already enabled.
+ -fbasic-block-sections={all, <file>, labels, none} : Enables/Disables basic
block sections for all or a subset of basic blocks. "labels" only enables
basic block symbols.
+ -funique-basic-block-section-names: Enables unique section names for
basic block sections, disabled by default.
Differential Revision: https://reviews.llvm.org/D68049
These are mapped in MachO::getMachOArchName already, but were missing
in ToolChain::getDefaultUniversalArchName.
Having these reverse mapped here fixes weird inconsistencies like
-dumpmachine showing a target triple like "aarch64-apple-darwin",
while "clang -target aarch64-apple-darwin" didn't use to work (ended
up mapped as unknown-apple-ios).
Differential Revision: https://reviews.llvm.org/D79117
Summary: Before this patch, we used two different ways to pass options to align branches
depending on whether LTO is enabled. For example, `-mbranches-within-32B-boundaries`
w/o LTO and `-Wl,-plugin-opt=-x86-branches-within-32B-boundaries` w/ LTO. It's
inconvenient, so this patch unifies the way: we only need to pass options like
`-mbranches-within-32B-boundaries` to align branches, whether or not LTO is enabled.
Differential Revision: https://reviews.llvm.org/D80289
Summary:
This patch simply adds support for the new CPU in anticipation of
Power10. There isn't really any functionality added so there are no
associated test cases at this time.
Reviewers: stefanp, nemanjai, amyk, hfinkel, power-llvm-team, #powerpc
Reviewed By: stefanp, nemanjai, amyk, #powerpc
Subscribers: NeHuang, steven.zhang, hiraditya, llvm-commits, wuzish, shchenz, cfe-commits, kbarton, echristo
Tags: #clang, #powerpc, #llvm
Differential Revision: https://reviews.llvm.org/D80020
-fno-semantic-interposition is currently the CC1 default. (The opposite
disables some interprocedural optimizations.) However, it does not infer
dso_local: on most targets accesses to ExternalLinkage functions/variables
defined in the current module still need PLT/GOT.
This patch makes explicit -fno-semantic-interposition infer dso_local,
so that PLT/GOT can be eliminated if targets implement local aliases
for AsmPrinter::getSymbolPreferLocal (currently only x86).
Currently we check whether the module flag "SemanticInterposition" is 0.
If yes, infer dso_local. In the future, we can infer dso_local unless
"SemanticInterposition" is 1: frontends other than clang will also
benefit from the optimization if they don't bother setting the flag.
(There will be risks if they do want ELF interposition: they need to set
"SemanticInterposition" to 1.)
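The current and the possible future inference rules differ only in how a missing module flag is treated. A minimal sketch, assuming the flag value is modelled as an optional integer (the function name and the `future_rule` switch are hypothetical, not LLVM API):

```python
from typing import Optional


def infers_dso_local(semantic_interposition: Optional[int],
                     future_rule: bool = False) -> bool:
    """Model the rule for the "SemanticInterposition" module flag.

    semantic_interposition is the flag value, or None when the
    frontend did not set the flag at all.
    """
    if future_rule:
        # Possible future rule: infer dso_local unless the flag is 1.
        return semantic_interposition != 1
    # Current rule: infer dso_local only when the flag is exactly 0.
    return semantic_interposition == 0
```

Under the future rule a frontend that never sets the flag gets the optimization, which is why such a frontend would have to set the flag to 1 to keep ELF interposition.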
Summary: On AIX, add '-bcdtors:all:0:s' to the linker implicitly through the driver so that we can collect all static constructor and destructor functions.
Reviewers: hubert.reinterpretcast, Xiangling_L, ZarkoCA, daltenty
Reviewed By: hubert.reinterpretcast
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D80415
The various HIP builds are all inconsistent.
The default llvm install goes to ${INSTALL_PREFIX}/bin/clang, but the
rocm packaging scripts move this under
${INSTALL_PREFIX}/llvm/bin/clang. Some other builds further pollute
this with ${INSTALL_PREFIX}/bin/x86_64/clang. These should really be
consolidated, but try to handle them for now.
This is required to get avr-gdb correctly showing values at the right
addresses. This problem was discovered by using debug symbols in an
external program to lookup values in an AVR simulator.
Enables Machine Outlining for ARM and Thumb2 modes. This is the first
patch of the series which adds all the basic logic for the support, and
only handles tail-calls and thunks.
The outliner can be turned on with the clang -moutline option or with
-mllvm -enable-machine-outliner (as on AArch64).
Differential Revision: https://reviews.llvm.org/D76066
Commit 73152a2ec2 fixed type checking for
blocks with qualified id parameters. But there are existing APIs in
Apple SDKs relying on the old type checking behavior. Specifically,
these are APIs using NSItemProviderCompletionHandler in
Foundation/NSItemProvider.h. To keep existing code working and to allow
developers to use affected APIs, introduce a compatibility mode that
enables the previous and the fixed type checking. This mode is enabled
only on Darwin platforms.
Reviewed By: jyknight, ahatanak
Differential Revision: https://reviews.llvm.org/D79511
Fixes PR42445 (compiler driver options -Os -Oz translate to
-plugin-opt=Os (Oz) which are not recognized by LLVMgold.so or LLD).
The optimization level mapping matches
CompilerInvocation.cpp:getOptimizationLevel() and SpeedLevel of
PassBuilder::OptimizationLevel::O*.
-plugin-opt=O* affects the way we construct regular LTO/ThinLTO pass
manager pipeline.
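The mapping can be sketched as a lookup from the driver's -O flag to the digit forwarded as -plugin-opt=O<digit>. Only the -Os and -Oz rows come from this message; the -Og, -Ofast and -O4 rows are assumptions about how the optimization-level logic treats those flags, and the function name is hypothetical.

```python
def plugin_opt_level(o_flag: str) -> str:
    """Return the digit to forward as -plugin-opt=O<digit>."""
    named = {
        "-Os": "2",     # size levels use the speed level of O2
        "-Oz": "2",
        "-Og": "1",     # assumption, not stated in this message
        "-Ofast": "3",  # assumption, not stated in this message
        "-O4": "3",     # assumption, not stated in this message
    }
    if o_flag in named:
        return named[o_flag]
    if o_flag in ("-O0", "-O1", "-O2", "-O3"):
        # Numeric levels are forwarded unchanged.
        return o_flag[2:]
    raise ValueError(f"unsupported optimization flag: {o_flag}")
```

For example, `plugin_opt_level("-Oz")` yields `"2"`, so the linker plugin receives a level it recognizes instead of `Oz`.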
Reviewed By: pcc
Differential Revision: https://reviews.llvm.org/D79919
-nogpulib makes sense when there is a host (where -nostdlib would
apply) and an offload target. Accept -nostdlib as an alias when there
is no offload target.
Merge with the new --rocm-path handling used for OpenCL. This looks
for a usable set of device libraries upfront, rather than giving a
generic "no such file or directory error". If any of the required
bitcode libraries are missing, this will now produce a "cannot find
ROCm installation." error. This differs from the existing hip specific
flags by pointing to a rocm root install instead of a single directory
with bitcode files.
This tries to maintain compatibility with the existing
--hip-device-lib and --hip-device-lib-path flags, as well as the
HIP_DEVICE_LIB_PATH environment variable, or at least the range of
uses with testcases. The existing range of uses and behavior doesn't
entirely make sense to me, so some of the untested edge cases change
behavior. Currently the two path forms seem to have the double purpose
of a search path for an arbitrary --hip-device-lib, and for finding
the stock set of libraries. Since the stock set of libraries This also
changes the behavior when multiple paths are specified, and only takes
the last one (and the environment variable only handles a single
path).
If --hip-device-lib is used, it now only treats --hip-device-lib-path
as the search path for it, and does not attempt to find the rocm
installation. If not, --hip-device-lib-path and the environment
variable are used as the directory to search instead of the rocm root
based path.
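The selection rules in the last two paragraphs can be sketched as follows. The precedence of --hip-device-lib-path over the environment variable is an assumption (the message does not state it), the layout below the ROCm root is not modelled, and all names are hypothetical.

```python
from typing import List, Optional, Tuple


def device_lib_search(lib_paths: List[str],
                      env_path: Optional[str],
                      uses_explicit_lib: bool,
                      rocm_root: str) -> Tuple[str, Optional[str]]:
    """Return (mode, directory) for the device library search.

    lib_paths: every --hip-device-lib-path value, in command-line order.
    env_path: value of HIP_DEVICE_LIB_PATH, or None if unset.
    uses_explicit_lib: True when --hip-device-lib was given.
    """
    # When several paths are given, only the last one is used.
    flag_path = lib_paths[-1] if lib_paths else None
    if uses_explicit_lib:
        # The path is only a search path for the named library;
        # no ROCm installation detection takes place.
        return ("explicit-lib", flag_path)
    # Assumption: the flag takes precedence over the environment variable.
    override = flag_path if flag_path is not None else env_path
    if override is not None:
        return ("path-override", override)
    # Otherwise fall back to the detected ROCm installation root.
    return ("rocm-root", rocm_root)
```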
This should also automatically fix handling of the options to use
wave64.
SLH doesn't support asm goto and is unlikely to ever support it. Users of asm
goto need a way to choose whether to use asm goto or fallback to an SLH
compatible code path when SLH is enabled. This feature flag will give users
this ability.
Tested via unit test
Reviewed By: mattdr
Differential Revision: https://reviews.llvm.org/D79733
Adds a new data structure, ImmutableGraph, and uses RDF to find LVI gadgets and add them to a MachineGadgetGraph.
More specifically, a new X86 machine pass finds Load Value Injection (LVI) gadgets consisting of a load from memory (i.e., SOURCE), and any operation that may transmit the value loaded from memory over a covert channel, or use the value loaded from memory to determine a branch/call target (i.e., SINK).
Also adds a new target feature to X86: +lvi-load-hardening
The feature can be added via the clang CLI using -mlvi-hardening.
Differential Revision: https://reviews.llvm.org/D75936
This patch adds a matrix type to Clang as described in the draft
specification in clang/docs/MatrixSupport.rst. It introduces a new option
-fenable-matrix, which can be used to enable the matrix support.
The patch adds new MatrixType and DependentSizedMatrixType types along
with the plumbing required. Loads of and stores to pointers to matrix
values are lowered to memory operations on 1-D IR arrays. After loading,
the loaded values are cast to a vector. This ensures matrix values use
the alignment of the element type, instead of LLVM's large vector
alignment.
The operators and builtins described in the draft spec will be added in
follow-up patches.
Reviewers: martong, rsmith, Bigcheese, anemet, dexonsmith, rjmccall, aaron.ballman
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D72281
Currently all Fuchsia ABIs use a 4k page size, departing from
the recommended page sizes in the respective psABI documents.
Differential Revision: https://reviews.llvm.org/D79667
The compact format is fully supported on Fuchsia and is the
preferred default.
Patch By: mcgrathr
Differential Revision: https://reviews.llvm.org/D79665
Summary:
`AsmPrinter::emitGlobalIndirectSymbol` is dependent on
`MCStreamer::emitAssignment` to produce `.set` directives for alias
symbols; however, the `.set` pseudo-op on AIX is documented as not
usable with external relocatable terms or expressions, which limits its
applicability in generating alias symbols.
Disable generating aliases on AIX until a different implementation
strategy is available.
Reviewers: cebowleratibm, jasonliu, sfertile, daltenty, DiggerLin
Reviewed By: jasonliu
Differential Revision: https://reviews.llvm.org/D79044
This is a standalone patch and this would help Propeller do a better job of code
layout as it can accurately attribute the profiles to the right internal linkage
function.
This also helps SampledFDO/AutoFDO correctly associate sampled profiles to the
right internal function. Currently, if there is more than one internal symbol
foo, their profiles are aggregated by SampledFDO.
This patch adds a new clang option, -funique-internal-funcnames, to generate
unique names for functions with internal linkage. This patch appends the md5
hash of the module name to the function symbol as a best effort to generate a
unique name for symbols with internal linkage.
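A minimal sketch of the naming idea, assuming the hash is appended after a `.` separator as a hex digest; the exact suffix format used by the option is not given here, and the function name is hypothetical.

```python
import hashlib


def unique_internal_name(symbol: str, module_name: str) -> str:
    """Append the MD5 hash of the module name to an internal symbol."""
    digest = hashlib.md5(module_name.encode("utf-8")).hexdigest()
    # Assumption: "." separator and hex encoding of the digest.
    return f"{symbol}.{digest}"
```

Two internal functions named `foo` in different modules then get different symbol names, so a sampled profile can be attributed to the right one. The scheme is best effort: two modules with the same name still collide.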
Differential Revision: https://reviews.llvm.org/D73307
Summary:
MemTag does not have any runtime at the moment, it's strictly code
instrumentation.
Reviewers: pcc
Subscribers: cryptoad, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D79522
Summary:
When forking in several threads, the counters were written out using the same global static variables (see GCDAProfiling.c), which leads to crashes.
So when there is a fork, the counters are reset in the child process and they will be dumped at exit using the interprocess file locking.
When there is an exec, the counters are written out, and in case of failure they are reset.
Reviewers: jfb, vsk, marco-c, serge-sans-paille
Reviewed By: marco-c, serge-sans-paille
Subscribers: llvm-commits, serge-sans-paille, dmajor, cfe-commits, hiraditya, dexonsmith, #sanitizers, marco-c, sylvestre.ledru
Tags: #sanitizers, #clang, #llvm
Differential Revision: https://reviews.llvm.org/D78477
We need a way to know which targets clang supports, since
people may use clang as an assembler and want to
choose a clang that supports their target.
This patch lets clang print the registered targets when
the --version option is passed to clang.
Differential Revision: https://reviews.llvm.org/D79210
Summary: The current code for GNU/Linux is actually completely generic, and can be moved to Gnu, so it can benefit GNU/Hurd and GNU/kFreeBSD
Reviewers: kristina, sammccall, lebedev.ri, MaskRay, arsenm, phosek
Reviewed By: MaskRay, phosek
Subscribers: wdng, ormris, emaste, arichardson, krytarowski, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D73845
This is to avoid checking for the validity of a file that is not used.
This also contains a minor fix for the test, as the cfi sanitizer requires -flto and -fvisibility= arguments.
Differential Revision: https://reviews.llvm.org/D79043
This matches what is done for MSVC in
b8000c0ce8. Since that commit, compiler-rt
sanitizer libraries are no longer linked with absolute paths on Windows,
but by their basenames, which requires the libdirs to be passed to
the linker.
This fixes undefined behaviour sanitizer on MinGW after
b8000c0ce8.
Differential Revision: https://reviews.llvm.org/D79076
Prior to this change, for a few compiler-rt libraries such as ubsan and
the profile library, Clang would embed "-defaultlib:path/to/rt-arch.lib"
into the .drectve section of every object compiled with
-fprofile-instr-generate or -fsanitize=ubsan as appropriate.
These paths assume that the link step will run from the same working
directory as the compile step. There is also evidence that sometimes the
paths become absolute, such as when clang is run from a different drive
letter from the current working directory. This is fragile, and I'd like
to get away from having paths embedded in the object if possible. Long
ago it was suggested that we use this for ASan, and apparently I felt
the same way back then:
https://reviews.llvm.org/D4428#56536
This is also consistent with how all other autolinking usage works for
PS4, Mac, and Windows: they all use basenames, not paths.
To keep things working for people using the standard GCC driver
workflow, the driver now adds the resource directory to the linker
library search path when it calls the linker. This is enough to make
check-ubsan pass, and seems like a generally good thing.
Users that invoke the linker directly (most clang-cl users) will have to
add clang's resource library directory to their linker search path in
their build system. I'm not sure where I can document this. Ideally I'd
also do it in the MSBuild files, but I can't figure out where they go.
I'd like to start with this for now.
Reviewed By: hans
Differential Revision: https://reviews.llvm.org/D65543
Add support for reserving LR in:
* the driver through `-ffixed-x30`
* cc1 through `-target-feature +reserve-x30`
* the backend through `-mattr=+reserve-x30`
* a subtarget feature `reserve-x30`
the same way we're doing for the other registers.
The current code for GNU/Linux is actually completely generic, and can be moved to ToolChains/Gnu.cpp,
so that it can benefit GNU/Hurd and GNU/kFreeBSD.
Reviewed By: MaskRay, phosek
Differential Revision: https://reviews.llvm.org/D73845
This patch upstreams support for the Armv8.6-a Matrix Multiplication
Extension. A summary of the features can be found here:
https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a
This patch includes:
- Command line options to enable these features with +i8mm, +f32mm, or +f64mm
Note: +f32mm and +f64mm are optional and so are not enabled by default
This is part of a patch series, starting with BFloat16 support and
the other components in the armv8.6a extension (in previous patches
linked in phabricator)
Based on work by:
- Luke Geeson
- Oliver Stannard
- Luke Cheeseman
Reviewers: t.p.northover, DavidSpickett
Reviewed By: DavidSpickett
Subscribers: DavidSpickett, ostannard, kristof.beyls, danielkiss,
cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77875
Summary:
Change the default ABI to be compatible with GCC. For 32-bit ELF
targets other than Linux, Clang now returns small structs in registers
r3/r4. This affects FreeBSD, NetBSD, OpenBSD. There is no change for
32-bit Linux, where Clang continues to return all structs in memory.
Add clang options -maix-struct-return (to return structs in memory) and
-msvr4-struct-return (to return structs in registers) to be compatible
with gcc. These options are only for PPC32; reject them on PPC64 and
other targets. The options are like -fpcc-struct-return and
-freg-struct-return for X86_32, and use similar code.
To actually return a struct in registers, coerce it to an integer of the
same size. LLVM may optimize the code to remove unnecessary accesses to
memory, and will return i32 in r3 or i64 in r3:r4.
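The resulting convention for PPC32 can be sketched as a decision function. The 8-byte limit is inferred from the r3:r4 register pair mentioned above, and the function name and return strings are hypothetical; this is not the code in the patch.

```python
from typing import Optional


def ppc32_struct_return(size_bytes: int, is_linux: bool,
                        flag: Optional[str] = None) -> str:
    """Return "memory" or the integer type the struct is coerced to."""
    if flag == "-maix-struct-return":
        in_registers = False
    elif flag == "-msvr4-struct-return":
        in_registers = True
    elif flag is None:
        # Default: registers on 32-bit ELF targets other than Linux.
        in_registers = not is_linux
    else:
        raise ValueError(f"unknown flag: {flag}")
    # Assumption: only structs of up to 8 bytes fit in r3 or r3:r4.
    if in_registers and 0 < size_bytes <= 8:
        # Coerce to an integer of the same size, e.g. i32 or i64.
        return f"i{size_bytes * 8}"
    return "memory"
```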
Fixes PR#40736
Patch by George Koehler!
Reviewed By: jhibbits, nemanjai
Differential Revision: https://reviews.llvm.org/D73290
Summary:
This flag has been deprecated, with an on-by-default warning encouraging
users to explicitly specify whether they mean "all" or ubsan for 5 years
(released in Clang 3.7). Change it to mean what we wanted and
undeprecate it.
Also make the argument to -fsanitize-trap optional, and likewise default
it to 'all', and express the aliases for these flags in the .td file
rather than in code. (Plus documentation updates for the above.)
Reviewers: kcc
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77753
Since the default logic was based on having fast denormal/fma
features, and the default target has no features, we assumed flushing
by default. This fixes incorrectly assuming flushing in builds for
"generic" IR libraries.
The handling for no specified --cuda-gpu-arch in HIP is kind of
broken. Somewhere else forces a default target of gfx803, which does
not enable denormal handling by default. We don't see this default
switching here, so you'll end up with a different denormal mode
depending on whether you explicitly requested gfx803, or used it by
default.
I didn't realize HIP was a distinct offloading kind, so the subtarget
was looking for -march, which isn't correct for HIP. We also have the
possibility of different denormal defaults in the case of multiple
offload targets, so we need to thread the JobAction through the target
hook.
Summary:
This commit adds two command-line options to clang.
These options let the user decide which functions will receive SanitizerCoverage instrumentation.
This is most useful in the libFuzzer use case, where it enables targeted coverage-guided fuzzing.
Patch by Yannis Juglaret of DGA-MI, Rennes, France
libFuzzer tests its target against an evolving corpus, and relies on SanitizerCoverage instrumentation to collect the code coverage information that drives corpus evolution. Currently, libFuzzer collects such information for all functions of the target under test, and adds to the corpus every mutated sample that finds a new code coverage path in any function of the target. We propose instead to let the user specify which functions' code coverage information is relevant for building the upcoming fuzzing campaign's corpus. To this end, we add two new command line options for clang, enabling targeted coverage-guided fuzzing with libFuzzer. We see targeted coverage guided fuzzing as a simple way to leverage libFuzzer for big targets with thousands of functions or multiple dependencies. We publish this patch as work from DGA-MI of Rennes, France, with proper authorization from the hierarchy.
Targeted coverage-guided fuzzing can accelerate bug finding for two reasons. First, the compiler will avoid costly instrumentation for non-relevant functions, accelerating fuzzer execution for each call to any of these functions. Second, the built fuzzer will produce and use a more accurate corpus, because it will not keep the samples that find new coverage paths in non-relevant functions.
The two new command line options are `-fsanitize-coverage-whitelist` and `-fsanitize-coverage-blacklist`. They accept files in the same format as the existing `-fsanitize-blacklist` option <https://clang.llvm.org/docs/SanitizerSpecialCaseList.html#format>. The new options influence SanitizerCoverage so that it will only instrument a subset of the functions in the target. We explain these options in detail in `clang/docs/SanitizerCoverage.rst`.
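How the two lists combine can be sketched with a simplified matcher. It assumes a function is instrumented only when its source file matches a `src:` entry and its name matches a `fun:` entry of the whitelist, and neither matches the blacklist. Section headers are ignored, and Python's `fnmatchcase` is swapped in for the special case list's own glob matching, so character classes behave differently. All names are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import Dict, List

Rules = Dict[str, List[str]]  # {"src": [patterns], "fun": [patterns]}


def parse_rules(text: str) -> Rules:
    """Collect src: and fun: patterns, skipping comments and blanks."""
    rules: Rules = {"src": [], "fun": []}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        kind, _, pattern = line.partition(":")
        if kind in rules:
            rules[kind].append(pattern)
    return rules


def _matches(patterns: List[str], value: str) -> bool:
    return any(fnmatchcase(value, pattern) for pattern in patterns)


def is_instrumented(src: str, fun: str,
                    whitelist: Rules, blacklist: Rules) -> bool:
    """Decide whether a function receives coverage instrumentation."""
    allowed = (_matches(whitelist["src"], src)
               and _matches(whitelist["fun"], fun))
    blocked = (_matches(blacklist["src"], src)
               or _matches(blacklist["fun"], fun))
    return allowed and not blocked
```

With a whitelist containing `src:SRC/src/woff2_dec.cc` and `fun:*` and an empty blacklist, only functions defined in that one file are selected.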
Consider now the woff2 fuzzing example from the libFuzzer tutorial <https://github.com/google/fuzzer-test-suite/blob/master/tutorial/libFuzzerTutorial.md>. We are aware that we cannot conclude much from this example because mutating compressed data is generally a bad idea, but let us use it anyway as an illustration for its simplicity. Let us use an empty blacklist together with one of the three following whitelists:
```
# (a)
src:*
fun:*
# (b)
src:SRC/*
fun:*
# (c)
src:SRC/src/woff2_dec.cc
fun:*
```
Running the built fuzzers shows how many instrumentation points the compiler adds: the fuzzer will output //XXX PCs//. Whitelist (a) is the instrument-everything whitelist; it produces 11912 instrumentation points. Whitelist (b) focuses coverage to instrument woff2 source code only, ignoring the dependency code for brotli (de)compression; it produces 3984 instrumentation points. Whitelist (c) focuses coverage to only instrument functions in the main file that deals with WOFF2 to TTF conversion, resulting in 1056 instrumentation points.
For experimentation purposes, we ran each fuzzer approximately 100 times, single process, with the initial corpus provided in the tutorial. We let the fuzzer run until it either found the heap buffer overflow or went out of memory. On this simple example, whitelists (b) and (c) found the heap buffer overflow more reliably and 5x faster than whitelist (a). The average execution times when finding the heap buffer overflow were as follows: (a) 904 s, (b) 156 s, and (c) 176 s.
We explain these results by the fact that WOFF2 to TTF conversion calls the brotli decompression algorithm's functions, which are mostly irrelevant for finding bugs in WOFF2 font reconstruction but nevertheless instrumented and used by whitelist (a) to guide fuzzing. This results in longer execution time for these functions and a partially irrelevant corpus. Contrary to whitelist (a), whitelists (b) and (c) will execute brotli-related functions without instrumentation overhead, and ignore new code paths found in them. This results in faster bug finding for WOFF2 font reconstruction.
The results for whitelist (b) are similar to the ones for whitelist (c). Indeed, WOFF2 to TTF conversion calls functions that are mostly located in SRC/src/woff2_dec.cc. The 2892 extra instrumentation points allowed by whitelist (b) do not tamper with bug finding, even though they are mostly irrelevant, simply because most of these functions do not get called. We get a slightly faster average time for bug finding with whitelist (b), which might indicate that some of the extra instrumentation points are actually relevant, or might just be random noise.
Reviewers: kcc, morehouse, vitalybuka
Reviewed By: morehouse, vitalybuka
Subscribers: pratyai, vitalybuka, eternalsakura, xwlin222, dende, srhines, kubamracek, #sanitizers, lebedev.ri, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #sanitizers, #llvm
Differential Revision: https://reviews.llvm.org/D63616
Currently the library is separately linked, but this is not sufficient to
implement fast math flags correctly. Each module should get the
version of the library appropriate for its combination of fast math
and related flags, with the attributes propagated into its functions
and internalized.
HIP already maintains the list of libraries, but this is not used for
OpenCL. Unfortunately, HIP uses a separate --hip-device-lib argument,
despite both languages using the same bitcode library. Eventually
these two searches need to be merged.
An additional problem is there are 3 different locations the libraries
are installed, depending on which build is used. This also needs to be
consolidated (or at least the search logic needs to deal with this
unnecessary complexity).
Summary:
* accept -x cu to indicate language is CUDA
* transfer CUDA language flag to header-file arguments
Differential Revision: https://reviews.llvm.org/D77451
This adds support for enabling experimental/unratified RISC-V ISA
extensions in the -march string in the case where an explicit version
number has been declared, and the -menable-experimental-extensions flag
has been provided.
This follows the design as discussed on the mailing lists in the
following RFC: http://lists.llvm.org/pipermail/llvm-dev/2020-January/138364.html
Since the RISC-V toolchain definition currently rejects any extension
with an explicit version number, the parsing logic has been tweaked to
support this, and to allow standard extensions to have their versions
checked in future patches.
The bitmanip 'b' extension has been added as a first use of this support,
it should easily extend to other as yet unratified extensions (such as
the vector 'v' extension).
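The acceptance rule for an experimental extension can be sketched as below. The sketch assumes single-letter extension names and the `<major>p<minor>` version spelling; the token `b0p92` used in the usage note is only an illustrative value, checking the version against a supported one is not modelled, and the function names are hypothetical.

```python
import re
from typing import Optional, Tuple

# One extension letter, optionally followed by <major>p<minor>.
_EXTENSION = re.compile(r"^([a-z])(?:(\d+)p(\d+))?$")


def parse_extension(token: str) -> Tuple[str, Optional[Tuple[int, int]]]:
    """Split an -march extension token into (name, version or None)."""
    match = _EXTENSION.match(token)
    if match is None:
        raise ValueError(f"malformed extension: {token}")
    name, major, minor = match.groups()
    version = (int(major), int(minor)) if major is not None else None
    return name, version


def experimental_extension_allowed(token: str, flag_given: bool) -> bool:
    """An experimental extension needs an explicit version and
    -menable-experimental-extensions."""
    _, version = parse_extension(token)
    return version is not None and flag_given
```

For instance, `experimental_extension_allowed("b0p92", True)` is accepted, while the same token without the flag, or a bare `b` with the flag, is rejected.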
Differential Revision: https://reviews.llvm.org/D73891
Summary:
The option `-mpad-max-prefix-size` performs some checking and delegates to the MC option `-x86-pad-max-prefix-size`. This option is designed to eliminate NOPs when we need to align something by adding redundant prefixes to instructions, e.g. it can be used along with `-malign-branch` and `-malign-branch-boundary` to pad branches with prefixes.
It has a similar (but slightly different) effect to GAS's option `-malign-branch-prefix-size`, e.g. `-mpad-max-prefix-size` can also eliminate NOPs emitted by align directives, so we use a different name here. I removed the option `-malign-branch-prefix-size` since it is unimplemented and not needed. If we need to be compatible with GAS, we can make `-malign-branch-prefix-size` an alias for this option later.
Reviewers: jyknight, reames, MaskRay, craig.topper, LuoYuanke
Reviewed By: MaskRay, LuoYuanke
Subscribers: annita.zhang, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77628
Generate PTX using newer versions of PTX and allow using sm_80 with CUDA-11.
None of the new features of CUDA-10.2+ have been implemented yet, so using these
versions will still produce a warning.
Differential Revision: https://reviews.llvm.org/D77670
Instead of hardcoding individual GPU mappings in multiple functions, keep them
all in one table and use it to look up the mappings.
We also don't care much about the 'virtual' architecture, so the API is trimmed
down to a simpler GPU->Virtual arch name lookup.
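The single-table approach can be sketched as follows. The three rows are illustrative sample entries rather than the table from the patch, and the type and function names are hypothetical.

```python
from typing import NamedTuple, Optional


class GpuEntry(NamedTuple):
    gpu: str           # name accepted on the command line
    virtual_arch: str  # corresponding virtual architecture name


# One table shared by every lookup (illustrative rows only).
GPU_TABLE = (
    GpuEntry("sm_35", "compute_35"),
    GpuEntry("sm_70", "compute_70"),
    GpuEntry("gfx803", "compute_amdgcn"),
)


def virtual_arch_for(gpu: str) -> Optional[str]:
    """Look up the virtual arch name for a GPU, or None if unknown."""
    for entry in GPU_TABLE:
        if entry.gpu == gpu:
            return entry.virtual_arch
    return None
```

Adding a GPU then means adding one row, instead of touching several functions that each hardcode part of the mapping.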
Differential Revision: https://reviews.llvm.org/D77665
For OpenMP target regions to piggy back on the CUDA/AMDGPU/... implementation of math functions,
we include the appropriate definitions inside of an `omp begin/end declare variant match(device={arch(nvptx)})` scope.
This way, the vendor specific math functions will become specialized versions of the system math functions.
When a system math function is called and a specialized version is available, the selection logic introduced in D75779
instead calls the specialized version. In contrast to the code path we used so far, the system header is actually included.
This means functions without specialized versions are available and so are macro definitions.
This should address PR42061, PR42798, and PR42799.
Reviewed By: ye-luo
Differential Revision: https://reviews.llvm.org/D75788
Update the sysroot expectation to match other targets and breakout
linux/musl toolchain tests into a new file.
Differential Revision: https://reviews.llvm.org/D77440
Summary:
- Use `device_builtin_surface` and `device_builtin_texture` for
surface/texture reference support. So far, both the host and device
use the same reference type, which could be revised later when
interface/implementation is stabilized.
Reviewers: yaxunl
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77583
clang with -flto does not handle -foptimization-record-path=<path>.
This duplicates the code from ToolChains/Clang.cpp with modifications to
support everything in the same fashion.
This pass replaces each indirect call/jump with a direct call to a thunk that looks like:
lfence
jmpq *%r11
This ensures that if the value in register %r11 was loaded from memory, then
the value in %r11 is (architecturally) correct prior to the jump.
Also adds a new target feature to X86: +lvi-cfi
("cfi" meaning control-flow integrity)
The feature can be added via clang CLI using -mlvi-cfi.
This is an alternate implementation to https://reviews.llvm.org/D75934 that merges the thunk insertion functionality with the existing X86 retpoline code.
Differential Revision: https://reviews.llvm.org/D76812
This wasn't respecting the flush mode based on the default, and also
wasn't correctly handling the explicit
-fno-cuda-flush-denormals-to-zero overriding the mode.
Prior to this change the clang interface stubs format resembled
something ending with a symbol list like this:
Symbols:
a: { Type: Func }
This was problematic because we didn't actually want a map format and
also because we didn't like that an empty symbol list required
"Symbols: {}". That is to say without the empty {} llvm-ifs would crash
on an empty list.
With this new format it is much clearer which field is the symbol
name, and the [] that is used to express an empty symbol vector
is optional, i.e.:
Symbols:
- { Name: a, Type: Func }
or
Symbols: []
or
Symbols:
This further diverges the format from existing llvm-elftapi. This is a
good thing because although the format originally came from the same
place, they are not the same in any way.
Differential Revision: https://reviews.llvm.org/D76979
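As a rough illustration of the new layout, the following Python sketch renders a symbol list in the sequence form shown above. The helper name format_symbols and its (name, type) tuple input are hypothetical and are not part of clang or llvm-ifs; the two-space indentation of entries is also an assumption.

```python
def format_symbols(symbols):
    """Render (name, type) pairs in the sequence-style Symbols layout.

    Hypothetical helper for illustration only; not part of clang or llvm-ifs.
    """
    if not symbols:
        # An empty list may be written as "Symbols: []" or as a bare
        # "Symbols:"; this sketch always emits the explicit form.
        return "Symbols: []"
    lines = ["Symbols:"]
    for name, sym_type in symbols:
        # Each entry names its fields, so the symbol name is unambiguous.
        lines.append("  - { Name: %s, Type: %s }" % (name, sym_type))
    return "\n".join(lines)
```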
The driver enables -fdiagnostics-show-option by default, so flip the CC1
default to reduce the lengths of common CC1 command lines.
This change also makes ParseDiagnosticArgs() consistently enable
-fdiagnostics-show-option by default.
Apparently HIPToolChain does not subclass from AMDGPUToolChain, so
this was not applying the new denormal attributes. I'm not sure why
this doesn't subclass. Just copy the implementation for now.
Since GlobalISel is maturing and is already on at -O0 for AArch64, it's not
completely "experimental". Create a more appropriate driver flag and make
the older option an alias for it.
Differential Revision: https://reviews.llvm.org/D77103
On Ubuntu, we want to raise the default CLANG_SYSTEMZ_ARCH to z13,
so allow configuring it via CMake.
On Debian, we want to raise it to z196.
Author: Dimitri John Ledkov
Differential Revision: https://reviews.llvm.org/D75914
Currently -fno-unroll-loops is ignored when doing LTO on Darwin. This
patch adds a new -lto-no-unroll-loops option to the LTO code generator
and forwards it to the linker if -fno-unroll-loops is passed.
Reviewers: thegameg, steven_wu
Reviewed By: thegameg
Differential Revision: https://reviews.llvm.org/D76916
Before this patch, it wasn't possible to extend the ThinLTO threads to all SMT/CMT threads in the system. Only one thread per core was allowed, as dictated by the use of llvm::heavyweight_hardware_concurrency() in the ThinLTO code. Any number passed to the LLD flag /opt:lldltojobs=..., or any other ThinLTO-specific flag, was previously interpreted in the context of llvm::heavyweight_hardware_concurrency(), which means with SMT disabled.
One can now say in LLD:
/opt:lldltojobs=0 -- Use one std::thread / hardware core in the system (no SMT). Default value if flag not specified.
/opt:lldltojobs=N -- Limit usage to N threads, regardless of usage of heavyweight_hardware_concurrency().
/opt:lldltojobs=all -- Use all hardware threads in the system. Equivalent to /opt:lldltojobs=$(nproc) on Linux and /opt:lldltojobs=%NUMBER_OF_PROCESSORS% on Windows. When an affinity mask is set for the process, threads will be created only for the cores selected by the mask.
When N > number-of-hardware-threads-in-the-system, the threads in the thread pool will be dispatched equally on all CPU sockets (tested only on Windows).
When N <= number-of-hardware-threads-on-a-CPU-socket, the threads will remain on the CPU socket where the process started (only on Windows).
Differential Revision: https://reviews.llvm.org/D75153
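The three forms of /opt:lldltojobs= described above can be modelled with a small Python sketch. This is only an illustration of the documented mapping, not LLD's implementation: the function resolve_lto_jobs and its parameters are hypothetical, and affinity masks and socket placement are ignored.

```python
def resolve_lto_jobs(value, physical_cores, hardware_threads):
    """Map an /opt:lldltojobs= value to a thread count.

    Hypothetical model of the documented behaviour; the core and thread
    counts are supplied by the caller rather than queried from the system.
    """
    if value == "all":
        # Use every hardware thread, including SMT siblings.
        return hardware_threads
    n = int(value)
    if n < 0:
        raise ValueError("job count must not be negative")
    if n == 0:
        # Default: one thread per hardware core, i.e. no SMT.
        return physical_cores
    # An explicit N is taken as-is, even if it exceeds the hardware count.
    return n
```

For a machine with 8 cores and 16 hardware threads, resolve_lto_jobs("0", 8, 16) gives the per-core default and resolve_lto_jobs("all", 8, 16) gives the full thread count.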
In this case we interpret the path as relative to the clang driver binary.
This allows SDKs to be built that include clang along with a custom
sysroot without requiring users to specify --sysroot to point to the
directory where they installed the SDK.
See https://github.com/WebAssembly/wasi-sdk/issues/58
Differential Revision: https://reviews.llvm.org/D76653
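A minimal Python sketch of this kind of resolution follows. It assumes that "relative to the driver binary" means relative to the directory containing the binary; the function resolve_sysroot and the example paths are hypothetical and do not come from the patch.

```python
import posixpath


def resolve_sysroot(sysroot, driver_path):
    """Resolve a possibly relative sysroot against the driver's directory.

    Assumption: relative paths are anchored at the directory that contains
    the driver binary. Absolute paths are returned unchanged.
    """
    if posixpath.isabs(sysroot):
        return sysroot
    driver_dir = posixpath.dirname(driver_path)
    # normpath collapses ".." components so the result is easy to read.
    return posixpath.normpath(posixpath.join(driver_dir, sysroot))
```

With a driver installed at /opt/sdk/bin/clang, a sysroot of ../share/sysroot would resolve inside /opt/sdk, so the SDK can be moved as a unit.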
When Clang crashes, a useful message is output:
"PLEASE submit a bug report to https://bugs.llvm.org/ and include the
crash backtrace, preprocessed source, and associated run script."
A similar message is now output for all tools.
Differential Revision: https://reviews.llvm.org/D74324
The argument after -Xarch_device will be added to the arguments for CUDA/HIP
device compilation and will be removed for host compilation.
The argument after -Xarch_host will be added to the arguments for CUDA/HIP
host compilation and will be removed for device compilation.
Differential Revision: https://reviews.llvm.org/D76520
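The behaviour of the two flags can be sketched in Python as a filter over the argument list. This is an illustrative model only: filter_xarch is a hypothetical helper, and it assumes each flag consumes exactly one following argument.

```python
def filter_xarch(args, for_device):
    """Return the arguments seen by the device or host compilation.

    Hypothetical model: the argument after the matching -Xarch_ flag is
    kept, and the argument after the other flag is dropped.
    """
    keep_flag = "-Xarch_device" if for_device else "-Xarch_host"
    drop_flag = "-Xarch_host" if for_device else "-Xarch_device"
    result = []
    i = 0
    while i < len(args):
        arg = args[i]
        if arg in (keep_flag, drop_flag):
            # Both flags consume the next argument; only one side keeps it.
            if arg == keep_flag and i + 1 < len(args):
                result.append(args[i + 1])
            i += 2
        else:
            result.append(arg)
            i += 1
    return result
```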
Extract common code into a function, in preparation for
adding CUDA/HIP host-only and device-only
options.
Differential Revision: https://reviews.llvm.org/D76455
The ".sdk" component is usually the last one in the -isysroot, so it
makes more sense to scan from the back. Also, technically, someone
could install Xcode into a directory ending with .sdk, which would
break this heuristic.
Differential Revision: https://reviews.llvm.org/D76097
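The back-to-front scan can be sketched in Python as follows. The helper find_sdk_component and the example path in the note below are hypothetical; the sketch only shows why scanning from the end picks the SDK directory even when an earlier component also ends in .sdk.

```python
def find_sdk_component(isysroot):
    """Return the last path component ending in ".sdk", or None.

    Hypothetical helper: scanning in reverse means a parent directory
    that happens to end in ".sdk" does not shadow the real SDK name.
    """
    components = [c for c in isysroot.split("/") if c]
    for component in reversed(components):
        if component.endswith(".sdk"):
            return component
    return None
```

For a path such as /Apps/Xcode.sdk/Developer/SDKs/MacOSX.sdk, a front-to-back scan would stop at Xcode.sdk, while this one returns MacOSX.sdk.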
Pass the small data limit to RISCVELFTargetObjectFile via a module flag,
so that the backend can set the small data section threshold from that value.
Data is placed in the small data section if it is smaller than
the threshold.
Differential Revision: https://reviews.llvm.org/D57497
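The placement rule above amounts to a size comparison, sketched here in Python. The function name is hypothetical, and the boundary case (size equal to the threshold) is not specified in the text; this sketch follows the wording literally and treats only strictly smaller data as small.

```python
def uses_small_data_section(size_in_bytes, threshold):
    """Decide whether a datum goes into the small data section.

    Assumption: "smaller than the threshold" is read as a strict
    comparison; the real backend may treat the boundary differently.
    """
    return size_in_bytes < threshold
```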
HIPToolChain::TranslateArgs calls TranslateArgs of the host toolchain with
the input args to get a list of derived args called DAL, then
goes through the input args itself and appends them to DAL.
This assumes that the host toolchain should not append any unchanged
args to DAL, otherwise there will be duplicates since
HIPToolChain will append it again.
This works for GNU toolchain since it returns an empty list for DAL.
However, MSVC toolchain will append unchanged args to DAL, which
causes duplicate args.
This patch makes the MSVC toolchain not append unchanged args for the HIP
offloading kind, which fixes this issue.
Differential Revision: https://reviews.llvm.org/D76032
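The duplication can be reproduced with a small Python model of the two translation steps. Both functions are hypothetical stand-ins for the toolchain methods, and the boolean parameter models whether the host toolchain copies unchanged args into DAL.

```python
def host_translate(args, appends_unchanged):
    """Model of the host toolchain's TranslateArgs.

    A host that appends unchanged args returns a copy of the input;
    one that does not returns an empty derived list.
    """
    return list(args) if appends_unchanged else []


def hip_translate(args, host_appends_unchanged):
    """Model of the HIP step: start from the host's DAL, then append
    every input arg again."""
    dal = host_translate(args, host_appends_unchanged)
    dal.extend(args)
    return dal
```

With a host that appends unchanged args, hip_translate(["-O2"], True) contains -O2 twice, which is the duplicate the patch avoids.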
This flag is used by avr-gcc (starting with v10) to set the width of the
double type. The double type is by default interpreted as a 32-bit
floating point number in avr-gcc instead of a 64-bit floating point
number as is common on other architectures. Starting with GCC 10, a new
option has been added to control this behavior:
https://gcc.gnu.org/wiki/avr-gcc#Deviations_from_the_Standard
This commit keeps the default double at 32 bits but adds support for the
-mdouble flag (-mdouble=32 and -mdouble=64) to control this behavior.
Differential Revision: https://reviews.llvm.org/D76181
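A Python sketch of how the flag selects the width of double is shown below. The helper double_width is hypothetical, and letting the last -mdouble= occurrence win is an assumption rather than something the text states.

```python
def double_width(args):
    """Return the width of double in bits for a list of driver args.

    Defaults to 32 bits, as described above. Assumption: if the flag
    appears more than once, the last occurrence wins.
    """
    prefix = "-mdouble="
    width = 32
    for arg in args:
        if arg.startswith(prefix):
            value = arg[len(prefix):]
            if value not in ("32", "64"):
                raise ValueError("unsupported -mdouble value: " + value)
            width = int(value)
    return width
```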
- libclang_rt.profile should be added when -fcs-profile-generate is on the command line.
- OPT_fno_profile_instr_generate was used as a negative for OPT_fprofile_generate. Fix it to use OPT_fno_profile_generate.
Differential Revision: https://reviews.llvm.org/D75274
This fixes an issue with clang issuing a warning about an unknown CUDA SDK if one is
detected during non-CUDA compilation.
Differential Revision: https://reviews.llvm.org/D76030
Device-side compilation does not support some features and we need to
filter them out when command line options enable them for the host.
We're already doing this in various places in the regular clang driver,
but clang-cl mode constructs cc1 options independently and needs to
implement the filtering, too.
Differential Revision: https://reviews.llvm.org/D75310
After a first attempt to fix the test-suite failures, my first recommit
caused the same failures again. I had updated CMakeLists.txt files of
tests that needed -fcommon, but it turns out that there are also
Makefiles which are used by some bots, so I've updated these Makefiles
now too.
See the original commit message for more details on this change:
0a9fc9233e
This includes fixes for:
- test-suite: some benchmarks need to be compiled with -fcommon, see D75557.
- compiler-rt: one test needed -fcommon, and another a change, see D75520.
Summary:
Users can select the version of SYCL the compiler will
use via the flag -sycl-std, similar to -cl-std.
The flag sets the LangOpts.SYCLVersion option to the
version of SYCL. The default value is undefined.
If the driver is building SYCL code, the flag is set to the default SYCL
version (1.2.1).
The preprocessor uses this variable to define the CL_SYCL_LANGUAGE_VERSION macro,
which should be defined according to the SYCL 1.2.1 standard.
The only valid value for the flag at this point is 1.2.1.
Co-Authored-By: David Wood <Q0KPU0H1YOEPHRY1R2SN5B5RL@david.davidtw.co>
Signed-off-by: Ruyman Reyes <ruyman@codeplay.com>
Subscribers: ebevhan, Anastasia, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D72857
This reverts commit 737394c490.
The fp-model test was failing on platforms that enable denormal flushing
based on -ffast-math. This needs to reset to IEEE, not the default in
these cases.
Change-Id: Ibbad32f66d0d0b89b9c1173a3a96fb1a570ddd89
The IR hasn't switched the default yet, so explicitly add the ieee
attributes.
I'm still not really sure how the target default denormal mode should
interact with -fno-unsafe-math-optimizations. The target may have
selected the default mode to be non-IEEE based on the flags or based
on its true behavior, but we don't know which is the case. Since the
only users of a non-IEEE mode without a flag still support IEEE mode,
just reset to IEEE.