Summary:
CUDA 9's minimum sm is sm_30.
Ideally we should also make sm_30 the default when compiling with CUDA
9, but that seems harder than it should be.
Subscribers: sanjoy
Differential Revision: https://reviews.llvm.org/D39109
llvm-svn: 316611
As indicated by Table 1-1 in Intel Architecture Instruction Set Extensions and Future Features Programming Reference from October 2017.
llvm-svn: 316593
NetBSD uses `long int` for `intptr_t` on ARM. This was changed in SVN
r316046, referenced against other compilers. However, NetBSD's
reference was incorrect as the current clang behaviour is more
up-to-date. Restore the original behaviour for that target.
llvm-svn: 316204
This patch has the following changes
A new flag "-mhvx-length={64B|128B}" is introduced to specify the length of the vector.
Previously we have used "-mhvx-double" for 128 Bytes. This adds the target-feature "+hvx-length{64|128}b"
The "-mhvx" flag must be provided on command line to enable HVX for Hexagon. If no -mhvx-length flag
is specified, a default length is picked from the arch mentioned in this priority order from either -mhvx=vxx
or -mcpu. For v60 and v62 the default length is 64 Byte. For unknown versions, the length is 128 Byte. The
-mhvx flag adds the target-feature "+hvxv{hvx_version}"
The 64 Byte mode is soon going to be deprecated. A warning is emitted if 64 Byte is enabled. A warning is
still emitted for the default 64 Byte as well. This warning can be suppressed with a -Wno flag.
The "-mhvx-double" and "-mno-hvx-double" flags are deprecated. A warning is emitted if the driver sees
them on commandline. "-mhvx-double" is an alias to "-mhvx-length=128B"
The compilation will error out if -mhvx-length is specified with out an -mhvx/-mhvx= flag
The macro HVX_LENGTH is defined and is set to the length of the vector.
Eg: #define HVX_LENGTH 64
The macro HVX_ARCH is defined and is set to the version of the HVX.
Eg: #define HVX_ARCH 62
Differential Revision: https://reviews.llvm.org/D38852
llvm-svn: 316102
Darwin and OpenBSD are the only platforms which use `long int` for
`__INTPTR_TYPE__`. The other platforms use `int` in 32-bit, and `long
int` on 64-bit (except for VMS and Windows which are LLP64). Adjust the
type definitions to match the platform definitions. We now generate the
same definition as GCC on all the targets.
llvm-svn: 316046
This commit allows the refactoring library to use its own set of
refactoring-specific diagnostics to reports things like initiation errors.
Differential Revision: https://reviews.llvm.org/D38772
llvm-svn: 315924
Summary:
Convert clang::LangAS to a strongly typed enum
Currently both clang AST address spaces and target specific address spaces
are represented as unsigned which can lead to subtle errors if the wrong
type is passed. It is especially confusing in the CodeGen files as it is
not possible to see what kind of address space should be passed to a
function without looking at the implementation.
I originally made this change for our LLVM fork for the CHERI architecture
where we make extensive use of address spaces to differentiate between
capabilities and pointers. When merging the upstream changes I usually
run into some test failures or runtime crashes because the wrong kind of
address space is passed to a function. By converting the LangAS enum to a
C++11 we can catch these errors at compile time. Additionally, it is now
obvious from the function signature which kind of address space it expects.
I found the following errors while writing this patch:
- ItaniumRecordLayoutBuilder::LayoutField was passing a clang AST address
space to TargetInfo::getPointer{Width,Align}()
- TypePrinter::printAttributedAfter() prints the numeric value of the
clang AST address space instead of the target address space.
However, this code is not used so I kept the current behaviour
- initializeForBlockHeader() in CGBlocks.cpp was passing
LangAS::opencl_generic to TargetInfo::getPointer{Width,Align}()
- CodeGenFunction::EmitBlockLiteral() was passing a AST address space to
TargetInfo::getPointerWidth()
- CGOpenMPRuntimeNVPTX::translateParameter() passed a target address space
to Qualifiers::addAddressSpace()
- CGOpenMPRuntimeNVPTX::getParameterAddress() was using
llvm::Type::getPointerTo() with a AST address space
- clang_getAddressSpace() returns either a LangAS or a target address
space. As this is exposed to C I have kept the current behaviour and
added a comment stating that it is probably not correct.
Other than this the patch should not cause any functional changes.
Reviewers: yaxunl, pcc, bader
Reviewed By: yaxunl, bader
Subscribers: jlebar, jholewinski, nhaehnle, Anastasia, cfe-commits
Differential Revision: https://reviews.llvm.org/D38816
llvm-svn: 315871
Currently Clang uses default address space (0) to represent private address space for OpenCL
in AST. There are two issues with this:
Multiple address spaces including private address space cannot be diagnosed.
There is no mangling for default address space. For example, if private int* is emitted as
i32 addrspace(5)* in IR. It is supposed to be mangled as PUAS5i but it is mangled as
Pi instead.
This patch attempts to represent OpenCL private address space explicitly in AST. It adds
a new enum LangAS::opencl_private and adds it to the variable types which are implicitly
private:
automatic variables without address space qualifier
function parameter
pointee type without address space qualifier (OpenCL 1.2 and below)
Differential Revision: https://reviews.llvm.org/D35082
llvm-svn: 315668
This allows clients to avoid an unnecessary fs::status() call on each
directory entry. Because the information returned by FindFirstFileEx
is a subset of the information returned by a regular status() call,
I needed to extract a base class from file_status that contains only
that information.
On my machine, this reduces the time required to enumerate a ThinLTO
cache directory containing 520k files from almost 4 minutes to less
than 2 seconds.
Differential Revision: https://reviews.llvm.org/D38716
llvm-svn: 315378
Move the logic for determining the `wchar_t` type information into the
driver. Rather than passing the single bit of information of
`-fshort-wchar` indicate to the frontend the desired type of `wchar_t`
through a new `-cc1` option of `-fwchar-type` and indicate the
signedness through `-f{,no-}signed-wchar`. This replicates the current
logic which was spread throughout Basic into the
`RenderCharacterOptions`.
Most of the changes to the tests are to ensure that the frontend uses
the correct type. Add a new test set under `test/Driver/wchar_t.c` to
ensure that we calculate the proper types for the various cases.
llvm-svn: 315126
https://reviews.llvm.org/D38371
This patch implements codegen for the combined 'teams distribute" OpenMP pragma and adds regression tests for all its clauses.
llvm-svn: 314905
Summary:
Also:
- Add support for some older Myriad CPUs that were missing.
- Fix some incorrect compiler defines for exisitng CPUs.
Reviewers: jyknight
Subscribers: fedor.sergeev
Differential Revision: https://reviews.llvm.org/D37551
llvm-svn: 314706
Currently AMDGPU inline asm only allow v and s as register names in constraints.
This patch allows the following register names in constraints: (n, m is unsigned integer, n < m)
v
s
{vn} or {v[n]}
{sn} or {s[n]}
{S} , where S is a special register name
{v[n:m]}
{s[n:m]}
Differential Revision: https://reviews.llvm.org/D37568
llvm-svn: 314452
Summary:
This is the follow-up patch to D37924.
This change refactors clang to use the the newly added section headers
in SpecialCaseList to specify which sanitizers blacklists entries
should apply to, like so:
[cfi-vcall]
fun:*bad_vcall*
[cfi-derived-cast|cfi-unrelated-cast]
fun:*bad_cast*
The SanitizerSpecialCaseList class has been added to allow querying by
SanitizerMask, and SanitizerBlacklist and its downstream users have been
updated to provide that information. Old blacklists not using sections
will continue to function identically since the blacklist entries will
be placed into a '[*]' section by default matching against all
sanitizers.
Reviewers: pcc, kcc, eugenis, vsk
Reviewed By: eugenis
Subscribers: dberris, cfe-commits, mgorny
Differential Revision: https://reviews.llvm.org/D37925
llvm-svn: 314171
This is to fix PR31620. MaxAtomicInlineWidth is set to 128 for x86_64. However
for target without cx16 support, 128 atomic operation will generate __sync_*
libcalls. The patch set MaxAtomicInlineWidth to 64 if the target doesn't support
cx16.
Differential Revision: https://reviews.llvm.org/D38046
llvm-svn: 313992
This patch introduces a class that can help to build tools that require cross
translation unit facilities. This class allows function definitions to be loaded
from external AST files based on an index. In order to use this functionality an
index is required. The index format is a flat text file but it might be
replaced with a different solution in the near future. USRs are used as names to
look up the functions definitions. This class also does caching to avoid
redundant loading of AST files.
Right now only function defnitions can be loaded using this API because this is
what the in progress cross translation unit feature of the Static Analyzer
requires. In to future this might be extended to classes, types etc.
Differential Revision: https://reviews.llvm.org/D34512
llvm-svn: 313975
This arranges more of the Intel and AMD CPUs into fallthrough positions based on their features. We may be able to merge this new AMD set with the BTVER or BDVER sets but I didn't look that closely.
Differential Revision: https://reviews.llvm.org/D37941
llvm-svn: 313497
Summary:
For a lot of older CPUs we have a 1:1 mapping between CPU name and enum name. But many of them are effectively aliases of each other and as a result are always repeated together at every usage
This patch removes most of the duplication. It also uses StringSwitch::Cases to make the many to one mapping in the StringSwitch more obvious.
Reviewers: RKSimon, spatel, zvi, igorb
Reviewed By: RKSimon
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D37938
llvm-svn: 313462
Introduce a new "export_as" directive for top-level modules, which
indicates that the current module is a "private" module whose symbols
will eventually be exported through the named "public" module. This is
in support of a common pattern in the Darwin ecosystem where a single
public framework is constructed of several private frameworks, with
(currently) header duplication and some support from the linker.
Addresses rdar://problem/34438420.
llvm-svn: 313316
Summary:
The find_first_existing_file and find_first_existing_vc_file macros in
lib/Basic/CMakeLists.txt are removed. The macros are also defined in
{LLVM}/cmake/modules/AddLLVM.cmake for the same purpose. This change
serves the following 2 objectives:
1. To remove the redundant code in clang to use the same
macros in llvm,
2. The macros in AddLLVM.cmake can also handle repo for
displaying correct version information.
Reviewers: jordan_rose, cfe-commits, modocache, hintonda
Reviewed By: hintonda
Subscribers: mgorny
Differential Revision: https://reviews.llvm.org/D35533
llvm-svn: 312865
This fixes a possible crash on certain kinds of corrupted AST file, but
checking in an AST file corrupted in just the right way will be a maintenance
nightmare because the format changes frequently.
llvm-svn: 312851
For now CUDA-9 is not included in the list of CUDA versions clang
searches for, so the path to CUDA-9 must be explicitly passed
via --cuda-path=.
On LLVM side NVPTX added sm_70 GPU type which bumps required
PTX version to 6.0, but otherwise is equivalent to sm_62 at the moment.
Differential Revision: https://reviews.llvm.org/D37576
llvm-svn: 312734
Diagnostic Categories are fairly annoying, and are only enforced
by a runtime-debug-only assert. This puts in a touch more work
to get this all done at compile-time with static asserts
Differential Revision: https://reviews.llvm.org/D37122
llvm-svn: 311905
Clang's DiagnosticError is an llvm::Error payload that stores a partial
diagnostic and its location. I'll be using it in the refactoring engine.
Differential Revision: https://reviews.llvm.org/D36969
llvm-svn: 311778
In patch r205628 using abs.[ds] instruction is forced, as they should behave
in accordance with flags Has2008 and ABS2008. Unfortunately for revisions
prior mips32r6 and mips64r6, abs.[ds] is not generating correct result when
working with NaNs. To generate a sequence which always produce a correct
result but also to allow user more control on how his code is compiled,
option -mabs is added where user can choose legacy or 2008.
By default legacy mode is used on revisions prior R6. Mips32r6 and mips64r6
use abs2008 mode by default.
Patch by Aleksandar Beserminji
Differential Revision: https://reviews.llvm.org/D35982
llvm-svn: 311669
This patch is intended to enable the use of basic double letter constraints used in GCC extended inline asm {Yi Y2 Yz Y0 Ym Yt}.
Supersedes D35205
llvm counterpart: D36369
Differential Revision: https://reviews.llvm.org/D36371
llvm-svn: 311643
Rename the function getSupportedNanEncoding() to getIEEE754Standard(), since
this function will be used for non-nan related features.
Patch by Aleksandar Beserminji.
Differential Revision: https://reviews.llvm.org/D36824
llvm-svn: 311454
Generalize getOpenCLImageAddrSpace into getOpenCLTypeAddrSpace, such
that targets can select the address space per type.
No functional changes intended.
Initial patch by Simon Perretta.
Differential Revision: https://reviews.llvm.org/D33989
llvm-svn: 310911
This patch adds support for __builtin_cpu_is. I've tried to match the strings supported to the latest version of gcc.
Differential Revision: https://reviews.llvm.org/D35449
llvm-svn: 310657
They still need to be implemented in the intrinsics, the command line, and the backend. But this change isn't dependent on any of that and resolves a TODO.
llvm-svn: 310386
This is similar to what's done on arm and x86_64, where
these calling conventions are silently ignored, as in
SVN r245076.
Differential Revision: https://reviews.llvm.org/D36105
llvm-svn: 310303
OpenCL 2.0 atomic builtin functions have a scope argument which is ideally
represented as synchronization scope argument in LLVM atomic instructions.
Clang supports translating Clang atomic builtin functions to LLVM atomic
instructions. However it currently does not support synchronization scope
of LLVM atomic instructions. Without this, users have to use LLVM assembly
code to implement OpenCL atomic builtin functions.
This patch adds OpenCL 2.0 atomic builtin functions as Clang builtin
functions, which supports generating LLVM atomic instructions with
synchronization scope operand.
Currently only constant memory scope argument is supported. Support of
non-constant memory scope argument will be added later.
Differential Revision: https://reviews.llvm.org/D28691
llvm-svn: 310082