Clang's internal build system for implicit modules uses lock files to
ensure that after a process writes a PCM it will read the same one back
in (without contention from other -cc1 commands). Since PCMs are read
from disk repeatedly while invalidating, building, and importing, the
lock is not released quickly. Furthermore, the LockFileManager is not
robust in every environment. Other -cc1 commands can stall until
timeout (after about eight minutes).
This commit changes the lock file from being necessary for correctness
to a (possibly dubious) performance hack. The remaining benefit is to
reduce duplicate work in competing -cc1 commands which depend on the
same module. Follow-up commits will change the internal build system to
continue after a timeout, and reduce the timeout. Perhaps we should
reconsider blocking at all.
This also fixes a use-after-free, when one part of a compilation
validates a PCM and starts using it, and another tries to swap out the
PCM for something new.
The PCMCache is a new type called MemoryBufferCache, which saves memory
buffers based on their filename. Its ownership is shared by the
CompilerInstance and ModuleManager.
- The ModuleManager stores PCMs there that it loads from disk, never
touching the disk if the cache is hot.
- When modules fail to validate, they're removed from the cache.
- When a CompilerInstance is spawned to build a new module, each
already-loaded PCM is assumed to be valid, and is frozen to avoid
the use-after-free.
- Any newly-built module is written directly to the cache to avoid the
round-trip to the filesystem, making lock files unnecessary for
correctness.
Original patch by Manman Ren; most testcases by Adrian Prantl!
llvm-svn: 298165
Modified the tests to accept any iteration order, to run only on Unix, and added
additional error reporting to investigate SystemZ bot issue.
The VFS directory iterator and recursive directory iterator behave differently
from the LLVM counterparts. Once the VFS iterators hit a broken symlink they
immediately abort. The LLVM counterparts don't stat entries unless they have to
descend into the next directory, which allows to recover from this issue by
clearing the error code and skipping to the next entry.
This change adds similar behavior to the VFS iterators. There should be no
change in current behavior in the current CLANG source base, because all
clients have loop exit conditions that also check the error code.
This fixes rdar://problem/30934619.
Differential Revision: https://reviews.llvm.org/D30768
llvm-svn: 297693
We can't actually pretend that 0 is valid for address space 0.
r295877 added a workaround to stop allocating user objects
there, so we can use 0 as the invalid pointer.
Some of the tests seemed to be using private as the non-0 null
test address space, so add copies using local to make sure
this is still stressed.
llvm-svn: 297659
Change ASTFileSignature from a random 32-bit number to the hash of the
PCM content.
- Move definition ASTFileSignature to Basic/Module.h so Module and
ASTSourceDescriptor can use it.
- Change the signature from uint64_t to std::array<uint32_t,5>.
- Stop using (saving/reading) the size and modification time of PCM
files when there is a valid SIGNATURE.
- Add UNHASHED_CONTROL_BLOCK, and use it to store the SIGNATURE record
and other records that shouldn't affect the hash. Because implicit
modules reuses the same file for multiple levels of -Werror, this
includes DIAGNOSTIC_OPTIONS and DIAG_PRAGMA_MAPPINGS.
This helps to solve a PCH + implicit Modules dependency issue: PCH files
are handled by the external build system, whereas implicit modules are
handled by internal compiler build system. This prevents invalidating a
PCH when the compiler overwrites a PCM file with the same content
(modulo the diagnostic differences).
Design and original patch by Manman Ren!
llvm-svn: 297655
Modified the tests to accept any iteration order.
The VFS directory iterator and recursive directory iterator behave differently
from the LLVM counterparts. Once the VFS iterators hit a broken symlink they
immediately abort. The LLVM counterparts allow to recover from this issue by
clearing the error code and skipping to the next entry.
This change adds the same functionality to the VFS iterators. There should be
no change in current behavior in the current CLANG source base, because all
clients have loop exit conditions that also check the error code.
This fixes rdar://problem/30934619.
Differential Revision: https://reviews.llvm.org/D30768
llvm-svn: 297528
The VFS directory iterator and recursive directory iterator behave differently
from the LLVM counterparts. Once the VFS iterators hit a broken symlink they
immediately abort. The LLVM counterparts allow to recover from this issue by
clearing the error code and skipping to the next entry.
This change adds the same functionality to the VFS iterators. There should be
no change in current behavior in the current CLANG source base, because all
clients have loop exit conditions that also check the error code.
This fixes rdar://problem/30934619.
Differential Revision: https://reviews.llvm.org/D30768
llvm-svn: 297510
that return record or vector types
The performSelector family of methods from Foundation use objc_msgSend to
dispatch the selector invocations to objects. However, method calls to methods
that return record types might have to use the objc_msgSend_stret as the return
value won't find into the register. This is also supported by this sentence from
performSelector documentation: "The method should not have a significant return
value and should take a single argument of type id, or no arguments". This
commit adds a new warning that warns when a selector which corresponds to a
method that returns a record type is passed into performSelector.
rdar://12056271
Differential Revision: https://reviews.llvm.org/D30174
llvm-svn: 297019
Summary:
Historically, NetBSD, FreeBSD and OpenBSD have defined the macro ABICALLS in
the preprocessor when -mabicalls is in effect.
Mainline GCC later defined __mips_abicalls when -mabicalls is in effect.
This patch teaches the preprocessor to define these macros when appropriate.
NetBSD does not require the ABICALLS macro.
This resolves PR/31694.
Thanks to Sean Bruno for highlighting this issue!
Reviewers: slthakur, seanbruno
Reviewed By: seanbruno
Subscribers: joerg, brad, emaste, seanbruno, cfe-commits
Differential Revision: https://reviews.llvm.org/D29032
llvm-svn: 295728
https://reviews.llvm.org/D29922
This patch adds two fields for use in the implementation of 'distribute parallel for':
The increment expression for the distribute loop. As the chunk assigned to a team is executed by multiple threads within the 'parallel for' region, the increment expression has to correspond to the value returned by the related runtime call (for_static_init).
The upper bound of the innermost loop ('for' in 'distribute parallel for') is not the globalUB expression normally used for pragma 'for' when found in isolation. It is instead the upper bound of the chunk assigned to the team ('distribute' loop). In this way, we prevent teams from executing chunks assigned to other teams.
The use of these two fields can be see in a related explanatory patch:
https://reviews.llvm.org/D29508
llvm-svn: 295497
Summary:
The -mmcu option for GCC sets macros like __AVR_ATmega328P__ (with the trailing
underscores), be sure to include these underscores for Clangs -mcpu option.
See "AVR Built-in Macros" in https://gcc.gnu.org/onlinedocs/gcc/AVR-Options.html
Reviewers: jroelofs, dylanmckay
Reviewed By: jroelofs, dylanmckay
Subscribers: efriedma, cfe-commits
Differential Revision: https://reviews.llvm.org/D29817
llvm-svn: 294869
until we can get better TargetMachine::isCompatibleDataLayout to compare - otherwise
we can't code generate existing bitcode without a string equality data layout.
This reverts commit r294703.
llvm-svn: 294708
For other platforms we should find out what they need and likely
make the same change, however, a smaller additional change is easier
for platforms we know have it specified in the ABI.
clang support for r294702
llvm-svn: 294703
This is a followup change to add v7ve support to clang for gcc
compatibility. Please see r294661.
Patch by Manoj Gupta.
Differential Revision: https://reviews.llvm.org/D29773
llvm-svn: 294662
1. Adds the command line flag for clzero.
2. Includes the clzero flag under znver1.
3. Defines the macro for clzero.
4. Adds a new file which has the intrinsic definition for clzero instruction.
Patch by Ganesh Gopalasubramanian with some additional tests from me.
Differential revision: https://reviews.llvm.org/D29386
llvm-svn: 294559
This feature flag indicates that the processor has support for removing certain instructions from user mode software. But the feature flag by itself doesn't indicate if the support is enabled in the OS. The affected instructions aren't even instructions the compiler would emit. So I don't think think this feature flag should be in the compiler.
llvm-svn: 294414
GCC 7 will predefine two new macros on s390x:
- __ARCH__ indicates the ISA architecture level
- __VX__ indicates that the vector facility is available
This adds those macros to clang as well to ensure continued
compatibility with GCC.
llvm-svn: 294197
Summary:
This tells clang about all of the different AVR microcontrollers.
It also adds code to define the correct preprocessor macros for each
device.
Reviewers: jroelofs, asl
Reviewed By: asl
Subscribers: asl, cfe-commits
Differential Revision: https://reviews.llvm.org/D28346
llvm-svn: 294177
Summary:
Previously the method would simply return false, causing every single
inline assembly constraint to trigger a compile error.
This adds inline assembly constraint support for the AVR target.
This patch is derived from the code in
AVRISelLowering::getConstraintType.
More details can be found on the AVR-GCC reference wiki
http://www.nongnu.org/avr-libc/user-manual/inline_asm.html
Reviewers: jroelofs, asl
Reviewed By: asl
Subscribers: asl, ahatanak, saaadhu, cfe-commits
Differential Revision: https://reviews.llvm.org/D28344
llvm-svn: 294176
GCC does not generate `__unix` nor `unix` macros. The latter already
intrudes into the user's namespace and should be avoided. Use the
canonical spelling of `__unix__` across all the targets.
llvm-svn: 294148
First pass at generating weak definitions of inline functions from module files
(& skipping (-O0) or emitting available_externally (optimizations)
definitions where those modules are used).
External functions defined in modules are emitted into the modular
object file as well (this may turn an existing ODR violation (if that
module were imported into multiple translations) into valid/linkable
code).
Internal symbols (static functions, for example) are not correctly
supported yet. The symbol will be produced, internal, in the modular
object - unreferenceable from the users.
Reviewers: rsmith
Differential Revision: https://reviews.llvm.org/D28845
llvm-svn: 293456
Rather than storing a single flat list of SourceLocations where the diagnostic
state changes (in source order), we now store a separate list for each FileID
in which there is a diagnostic state transition. (State for other files is
built and cached lazily, on demand.) This has two consequences:
1) We can now sensibly support modules, and properly track the diagnostic state
for modular headers (this matters when, for instance, triggering instantiation
of a template defined within a module triggers diagnostics).
2) It's much faster than the old approach, since we can now just do a binary
search on the offsets within the FileID rather than needing to call
isBeforeInTranslationUnit to determine source order (which is surprisingly
slow). For some pathological (but real world) files, this reduces total
compilation time by more than 10%.
For now, the diagnostic state points for modules are loaded eagerly. It seems
feasible to defer this until diagnostic state information for one of the
module's files is needed, but that's not part of this patch.
llvm-svn: 293123
This patch adds support for codegen of 'target teams' on the host.
This combined directive has two captured statements, one for the
'teams' region, and the other for the 'parallel'.
This target teams region is offloaded using the __tgt_target_teams()
call. The patch sets the number of teams as an argument to
this call.
Reviewers: ABataev
Differential Revision: https://reviews.llvm.org/D29084
llvm-svn: 293005
This patch adds support for codegen of 'target teams' on the host.
This combined directive has two captured statements, one for the
'teams' region, and the other for the 'parallel'.
This target teams region is offloaded using the __tgt_target_teams()
call. The patch sets the number of teams as an argument to
this call.
Reviewers: ABataev
Differential Revision: https://reviews.llvm.org/D29084
llvm-svn: 293001
This patch adds support for codegen of 'target parallel' on the host.
It is also the first combined directive that requires two or more
captured statements. Support for this functionality is included in
the patch.
A combined directive such as 'target parallel' has two captured
statements, one for the 'target' and the other for the 'parallel'
region. Two captured statements are required because each has
different implicit parameters (see SemaOpenMP.cpp). For example,
the 'parallel' has 'global_tid' and 'bound_tid' while the 'target'
does not. The patch adds support for handling multiple captured
statements based on the combined directive.
When codegen'ing the 'target parallel' directive, the 'target'
outlined function is created using the outer captured statement
and the 'parallel' outlined function is created using the inner
captured statement.
Reviewers: ABataev
Differential Revision: https://reviews.llvm.org/D28753
llvm-svn: 292419
This is just wasted space, we don't support state points from multiple
source managers. Validate that there's no state when resetting the
source manager and use the 'global' reference to the sourcemanager
instead of the ones in the diag state.
llvm-svn: 292402
This patch adds support for codegen of 'target parallel' on the host.
It is also the first combined directive that requires two or more
captured statements. Support for this functionality is included in
the patch.
A combined directive such as 'target parallel' has two captured
statements, one for the 'target' and the other for the 'parallel'
region. Two captured statements are required because each has
different implicit parameters (see SemaOpenMP.cpp). For example,
the 'parallel' has 'global_tid' and 'bound_tid' while the 'target'
does not. The patch adds support for handling multiple captured
statements based on the combined directive.
When codegen'ing the 'target parallel' directive, the 'target'
outlined function is created using the outer captured statement
and the 'parallel' outlined function is created using the inner
captured statement.
Reviewers: ABataev
Differential Revision: https://reviews.llvm.org/D28753
llvm-svn: 292374
This patch is to implement sema and parsing for 'target teams distribute simd’ pragma.
Differential Revision: https://reviews.llvm.org/D28252
llvm-svn: 291579
Summary:
This patch enables the following
1. AMD family 17h architecture using "znver1" tune flag (-march, -mcpu).
2. ISAs that are enabled for "znver1" architecture.
3. Checks ADX isa from cpuid to identify "znver1" flag when -march=native is used.
4. ISAs FMA4, XOP are disabled as they are dropped from amdfam17.
5. For the time being, it uses the btver2 scheduler model.
6. Test file is updated to check this flag.
This is linked to llvm review item https://reviews.llvm.org/D28017
Patch by Ganesh Gopalasubramanian. Additional test cases added by Craig Topper.
Reviewers: RKSimon, craig.topper
Subscribers: cfe-commits, RKSimon, ashutosh.nema, llvm-commits
Differential Revision: https://reviews.llvm.org/D28018
llvm-svn: 291544
Summary:
When compiling device code, we may still see host code with explicit
calling conventions. NVPTX needs to claim that it supports these CCs,
so that (a) we don't raise noisy warnings, and (b) we don't break
existing code which relies on the existence of these CCs when
specializing templates. (If a CC doesn't exist, clang ignores it, so
two template specializations which are different only insofar as one
specifies a CC are considered identical and therefore are an error if
that CC is not supported.)
Reviewers: tra
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D28323
llvm-svn: 291136
Summary:
CUDA lets users share structs between the host and device, so for that
and other reasons, primitive types such as ptrdiff_t should be the same
on both sides of the compilation.
Our code to do this wasn't entirely successful. In particular, we did a
bunch of work during the NVPTXTargetInfo constructor, only to override
it in the NVPTX{32,64}TargetInfo constructors. It worked well enough on
Linux and Mac, but Windows is LLP64, which is different enough to break
it.
This patch removes the NVPTX{32,64}TargetInfo classes entirely and fixes
the bug described above.
Reviewers: tra
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D28322
llvm-svn: 291135
Summary:
Authored by Senthil Kumar Selvaraj
This patch adds barebones support in Clang for the (experimental) AVR target. It uses the integrated assembler for assembly, and the GNU linker for linking, as lld doesn't know about the target yet.
The DataLayout string is the same as the one in AVRTargetMachine.cpp. The alignment specs look wrong to me, as it's an 8 bit target and all types only need 8 bit alignment. Clang failed with a datalayout mismatch error when I tried to change it, so I left it that way for now.
Reviewers: rsmith, dylanmckay, cfe-commits, rengolin
Subscribers: rengolin, jroelofs, wdng
Differential Revision: https://reviews.llvm.org/D27123
llvm-svn: 291082
This patch is to implement sema and parsing for 'target teams distribute parallel for simd’ pragma.
Differential Revision: https://reviews.llvm.org/D28202
llvm-svn: 290862
This patch is to implement sema and parsing for 'target teams distribute parallel for’ pragma.
Differential Revision: https://reviews.llvm.org/D28160
llvm-svn: 290725
According to extended asm syntax, a case where the clobber list includes a variable from the inputs or outputs should be an error - conflict.
for example:
const long double a = 0.0;
int main()
{
char b;
double t1 = a;
__asm__ ("fucompp": "=a" (b) : "u" (t1), "t" (t1) : "cc", "st", "st(1)");
return 0;
}
This should conflict with the output - t1 which is st, and st which is st aswell.
The patch fixes it.
Commit on behald of Ziv Izhar.
Differential Revision: https://reviews.llvm.org/D15075
llvm-svn: 290539
This patch is to implement sema and parsing for 'target teams distribute' pragma.
Differential Revision: https://reviews.llvm.org/D28015
llvm-svn: 290508
diagnostics and fix one such diagnostic.
Sadly, this assert doesn't catch this bug because we have no tests that
emit this diagnostic! Doh! I'm following up on the commit that
introduces it to get that fixed. Then this assert will help in a more
direct way.
llvm-svn: 290417
Merge all VFS mapped files inside -ivfsoverlay inputs into the vfs
overlay provided by the crash reproducer. This is the last missing piece
to allow crash reproducers to fully work with user frameworks; when
combined with headermaps, it allows clang to find additional frameworks.
rdar://problem/27913709
llvm-svn: 290326
Added a map to associate types and declarations with extensions.
Refactored existing diagnostic for disabled types associated with extensions and extended it to declarations for generic situation.
Fixed some bugs for types associated with extensions.
Allow users to use pragma to declare types and functions for supported extensions, e.g.
#pragma OPENCL EXTENSION the_new_extension_name : begin
// declare types and functions associated with the extension here
#pragma OPENCL EXTENSION the_new_extension_name : end
Differential Revision: https://reviews.llvm.org/D21698
llvm-svn: 289979
At least the plugin used by the LibreOffice build
(<https://wiki.documentfoundation.org/Development/Clang_plugins>) indirectly
uses those members (through inline functions in LLVM/Clang include files in turn
using them), but they are not exported by utils/extract_symbols.py on Windows,
and accessing data across DLL/EXE boundaries on Windows is generally
problematic.
Differential Revision: https://reviews.llvm.org/D26671
llvm-svn: 289647
In amdgcn target, null pointers in global, constant, and generic address space take value 0 but null pointers in private and local address space take value -1. Currently LLVM assumes all null pointers take value 0, which results in incorrectly translated IR. To workaround this issue, instead of emit null pointers in local and private address space, a null pointer in generic address space is emitted and casted to local and private address space.
Tentative definition of global variables with non-zero initializer will have weak linkage instead of common linkage since common linkage requires zero initializer and does not have explicit section to hold the non-zero value.
Virtual member functions getNullPointer and performAddrSpaceCast are added to TargetCodeGenInfo which by default returns ConstantPointerNull and emitting addrspacecast instruction. A virtual member function getNullPointerValue is added to TargetInfo which by default returns 0. Each target can override these virtual functions to get target specific null pointer and the null pointer value for specific address space, and perform specific translations for addrspacecast.
Wrapper functions getNullPointer is added to CodegenModule and getTargetNullPointerValue is added to ASTContext to facilitate getting the target specific null pointers and their values.
This change has no effect on other targets except amdgcn target. Other targets can provide support of non-zero null pointer in a similar way.
This change only provides support for non-zero null pointer for C and OpenCL. Supporting for other languages will be added later incrementally.
Differential Revision: https://reviews.llvm.org/D26196
llvm-svn: 289252
This patch is to implement sema and parsing for 'teams distribute parallel for' pragma.
Differential Revision: https://reviews.llvm.org/D27345
llvm-svn: 289179
Revert r288447 which introduced -mdirect. It turns out we don't need a
custom flag for this, as the information we need is in the target triple.
llvm-svn: 288604
This patch is to implement sema and parsing for 'teams distribute parallel for simd' pragma.
Differential Revision: https://reviews.llvm.org/D27084
llvm-svn: 288294
Summary:
This macro should be defined only when the user directly specifies an
API level as part of an Android target. For any regular Android target,
we leave this macro undefined.
Bug: https://llvm.org/bugs/show_bug.cgi?id=30940
Reviewers: eugenis, pirama
Subscribers: tberghammer, cfe-commits, pirama, eugenis, danalbert
Differential Revision: https://reviews.llvm.org/D26491
llvm-svn: 286543
Summary: This is needed to make the v64i8 and v32i16 types legal for the 512-bit VBMI instructions. Fixes PR30912.
Reviewers: delena, zvi
Subscribers: RKSimon, cfe-commits
Differential Revision: https://reviews.llvm.org/D26306
llvm-svn: 286340
Summary:
Bug: https://llvm.org/bugs/show_bug.cgi?id=30940
This macro (along with __ANDROID__) should always be defined for Android
targets. We set it to the major (only) version of the Android API being
compiled for. The Android version is able to be set as an integer suffix
for any valid Android target.
Reviewers: danalbert, eugenis
Subscribers: cfe-commits, pirama, eugenis, tberghammer, danalbert
Differential Revision: https://reviews.llvm.org/D26385
llvm-svn: 286295
which guarantee pointers are not null. These all seem to have useful
properties and correlations to document, in one case we even had it in
a comment but now it will also be an assert.
This should prevent PVS-Studio from incorrectly claiming that there are
a bunch of potential bugs here. But I feel really strongly that the
PVS-Studio warnings that pointed at this code have a far too high
false-positive rate to be entirely useful. These are just places where
there did seem to be a useful invariant to document and verify with an
assert. Several other places in the code were already correct and
already have perfectly clear code documenting and validating their
invariants, but still ran afoul of PVS-Studio.
llvm-svn: 285985
This patch implements the register call calling convention, which ensures
as many values as possible are passed in registers. CodeGen changes
were committed in https://reviews.llvm.org/rL284108.
Differential Revision: https://reviews.llvm.org/D25204
llvm-svn: 285849
Summary:
This patch adds a command line option '-cl-ext' to control a set of
supported OpenCL extensions. Option accepts a comma-separated list
of extensions prefixed with '+' or '-'.
It can be used together with a target triple to override support for some
extensions:
// spir target supports all extensions, but we want to disable fp64
clang -cc1 -triple spir-unknown-unknown -cl-ext=-cl_khr_fp64
Special 'all' extension allows to enable or disable all possible
extensions:
// only fp64 will be supported
clang -cc1 -triple spir-unknown-unknown -cl-ext=-all,+cl_khr_fp64
Patch by asavonic (Andrew Savonichev).
Reviewers: joey, yaxunl
Subscribers: yaxunl, bader, Anastasia, cfe-commits
Differential Revision: https://reviews.llvm.org/D23712
llvm-svn: 285700
Commit on behalf of: Coby Tayree
1.'v' constraint for (x86) non-avx arch imitates the already implemented 'x' constraint, i.e. allows XMM{0-15} & YMM{0-15} depending on the apparent arch & mode (32/64).
2.for the avx512 arch it allows [X,Y,Z]MM{0-31} (mode dependent)
This patch applies the needed changes to clang
LLVM patch: https://reviews.llvm.org/D25005
Differential Revision: https://reviews.llvm.org/D25005
llvm-svn: 285688
Commit on behalf of mharoush
Extending inline assembly support, compatible with GCC as folowing:
"k" constraint hints the compiler to select any of AVX512 k0-k7 registers.
"Yk" constraint is a subset of "k" excluding k0 which is not allowd to be used as a mask.
Reviewer: 1. rnk
Differential Revision: https://reviews.llvm.org/D25063
llvm-svn: 285604
For compatibility with other compilers on the platform, allow specifying
levels of the z/Architecture instead of model names with -march. In
particular, the following aliases are now supported:
-march=arch8 equals -march=z10
-march=arch9 equals -march=z196
-march=arch10 equals -march=zEC12
-march=arch11 equals -march=z13
This parallels the equivalent (and prerequisite) LLVM change in r285577.
llvm-svn: 285578
Commit on behalf of mharoush
After LGTM and check all:
This patch enables usage of k registers in inline assembly syntax.
Adding triple
Reviewer: 1. rnk
2. delena
Differential Revision: https://reviews.llvm.org/D25011
llvm-svn: 285563
Commit on behalf of mharoush
After LGTM and check all:
This patch enables usage of k registers in inline assembly syntax.
Reviewer: 1. rnk
2. delena
Differential Revision: https://reviews.llvm.org/D25011
llvm-svn: 285555
r276653 suppressed the pragma once warning when generating a PCH file.
This patch extends that to any main file for which clang is told (with
the -x option) that it's a header file. It will also suppress the
warning "#include_next in primary source file".
Differential Revision: http://reviews.llvm.org/D25989
llvm-svn: 285295
Fuchsia has experimental support for 32-bit x86 and ARM targets, add
them to the list of supported targets.
Differential Revision: https://reviews.llvm.org/D25838
llvm-svn: 285038
The 'no_undeclared_includes' attribute should be used in a module to
tell that only non-modular headers and headers from used modules are
accepted.
The main motivation behind this is to prevent dep cycles between system
libraries (such as darwin) and libc++.
Patch by Richard Smith!
llvm-svn: 284797
Use the LLVM_CMAKE_PATH variable to locate the GetSVN.cmake script.
The variable was already available in stand-alone builds, and is also
set by LLVM since r284581.
llvm-svn: 284582
Summary: _BitScan intrinsics (and some others, for example _Interlocked and _bittest) are supposed to work on both ARM and x86. This is an attempt to isolate them, avoiding repeating their code or writing separate function for each builtin.
Reviewers: hans, thakis, rnk, majnemer
Subscribers: RKSimon, cfe-commits, aemerson
Differential Revision: https://reviews.llvm.org/D25264
llvm-svn: 284060
Commit in the name of: Coby Tayree
1.'v' constraint for (x86) non-avx arch imitates the already implemented 'x' constraint, i.e. allows XMM{0-15} & YMM{0-15} depending on the apparent arch & mode (32/64).
2.for the avx512 arch it allows [X,Y,Z]MM{0-31} (mode dependent)
This patch applies the needed changes to clang
LLVM patch: https://reviews.llvm.org/D25005
Differential Revision: D25004
llvm-svn: 283716
Provide toolchain and tool support for Fuchsia operating system.
Fuchsia uses compiler-rt as the runtime library and libc++, libc++abi
and libunwind as the C++ standard library. lld is used as a default
linker.
Differential Revision: https://reviews.llvm.org/D25117
llvm-svn: 283420
Summary: We need x86-64-specific builtins if we want to implement some of the MS intrinsics - winnt.h contains definitions of some functions for i386, but not for x86-64 (for example _InterlockedOr64), which means that we cannot treat them as builtins for both i386 and x86-64, because then we have definitions of builtin functions in winnt.h on i386.
Reviewers: thakis, majnemer, hans, rnk
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D24598
llvm-svn: 283264
Summary:
Also makes -fcoroutines_ts to be both a Driver and CC1 flag.
Patch mostly by EricWF.
Reviewers: rnk, cfe-commits, rsmith, EricWF
Subscribers: mehdi_amini
Differential Revision: https://reviews.llvm.org/D25130
llvm-svn: 283064
assume that ::operator new provides no more alignment than is necessary for any
primitive type, except when we're on a GNU OS, where glibc's malloc guarantees
to provide 64-bit alignment on 32-bit systems and 128-bit alignment on 64-bit
systems. This can be controlled by the command-line -fnew-alignment flag.
llvm-svn: 282974
This patch corresponds to review:
https://reviews.llvm.org/D24397
It adds the __POWER9_VECTOR__ macro and the -mpower9-vector option along with
a number of altivec.h functions (refer to the code review for a list).
llvm-svn: 282481
Like NetBSD, OpenBSD prefers having a consistent set of typedefs
across the architectures it supports over strictly following the ARM
ABIs. The diff below makes sure that clang's view of those types
matches OpenBSD's system header files. It also adds a test that
checks the relevant types on all OpenBSD platforms that clang works
on. Hopefully we can add mips64 and powerpc to that list in the
future.
Patch by Mark Kettenis <mark.kettenis@xs4all.nl>
llvm-svn: 282184
Fix target options for fp32/64-denormals so that
+fp64-denormals is set if fp64 is supported
-fp32-denormals if fp32 denormals is not supported, or -cl-denorms-are-zero is set
+fp32-denormals if fp32 denormals is supported and -cl-denorms-are-zero is not set
If target feature fp32/64-denormals is explicitly set, they will override default options and options deduced from -cl-denorms-are-zero.
Differential Revision: https://reviews.llvm.org/D24512
llvm-svn: 281357
Summary:
This fixes a bug where we were unable to compile the following CUDA
file with libstdc++ (didn't try libc++):
#include <future>
void foo() { std::shared_future<int> x; }
The problem is that <future> only defines std::shared_future if
__GCC_ATOMIC_INT_LOCK_FREE > 1. When we compiled this file for device,
the macro was set to 1, and then the class didn't exist at all.
Reviewers: tra
Subscribers: cfe-commits, jhen
Differential Revision: https://reviews.llvm.org/D24407
llvm-svn: 281089
OpenCL requires __ENDIAN_LITTLE__ be set for little endian targets.
The default for targets was also apparently big endian, so AMDGPU
was incorrectly reported as big endian. Set this from the triple
so targets don't have another place to set the endianness.
llvm-svn: 280787
I'm in the progress of adding ARMv6 support to CloudABI. On the compiler
side, everything seems to work properly with this tiny change applied.
llvm-svn: 280672
This adds support for modules that require (non-)freestanding
environment, such as the compiler builtin mm_malloc submodule.
Differential Revision: https://reviews.llvm.org/D23871
llvm-svn: 280613
Summary:
We want wasm and asmjs to have matching ABIs, and right now asmjs uses
unsigned int for its size_t. This causes exported symbols in libcxx to
not match and can cause weird breakage where libcxx doesn't get linked
as a result. Long-term we probably want wasm32, wasm64, and asmjs to
all use unsigned long, but that would cause unnecessary ABI churn for
asmjs so defer that until we can make all the ABI changes at once.
Patch by Jacob Gravelle
Differential Revision: https://reviews.llvm.org/D24134
llvm-svn: 280420
This adds support for modules that require (no-)gnu-inline-asm
environment, such as the compiler builtin cpuid submodule.
This is the gnu-inline-asm variant of https://reviews.llvm.org/D23871
Differential Revision: https://reviews.llvm.org/D23905
rdar://problem/26931199
llvm-svn: 280159
Pointers of certain GPUs in AMDGCN target in private address space is 32 bit but pointers in other address spaces are 64 bit. size_t type should be defined as 64 bit for these GPUs so that it could hold pointers in all address spaces. Also fixed issues in pointer arithmetic codegen by using pointer specific intptr type.
Differential Revision: https://reviews.llvm.org/D23361
llvm-svn: 279121
This reverts commit r279003 as it breaks some of our buildbots (e.g.
clang-cmake-aarch64-quick, clang-x86_64-linux-selfhost-modules).
The error is in OpenMP/teams_distribute_simd_ast_print.cpp:
clang: /home/buildslave/buildslave/clang-cmake-aarch64-quick/llvm/include/llvm/ADT/DenseMap.h:527:
bool llvm::DenseMapBase<DerivedT, KeyT, ValueT, KeyInfoT, BucketT>::LookupBucketFor(const LookupKeyT&, const BucketT*&) const
[with LookupKeyT = clang::Stmt*; DerivedT = llvm::DenseMap<clang::Stmt*, long unsigned int>;
KeyT = clang::Stmt*; ValueT = long unsigned int;
KeyInfoT = llvm::DenseMapInfo<clang::Stmt*>;
BucketT = llvm::detail::DenseMapPair<clang::Stmt*, long unsigned int>]:
Assertion `!KeyInfoT::isEqual(Val, EmptyKey) && !KeyInfoT::isEqual(Val, TombstoneKey) &&
"Empty/Tombstone value shouldn't be inserted into map!"' failed.
llvm-svn: 279045
This patch is to implement sema and parsing for 'teams distribute simd’ pragma.
This patch is originated by Carlo Bertolli.
Differential Revision: https://reviews.llvm.org/D23528
llvm-svn: 279003
Reapply r278457 with test fixed to not abouse fs case sensitivity.
When the VFS uses a YAML file, the real file path for a
virtual file is described in the "external-contents" field. Example:
...
{
'type': 'file',
'name': 'a.h',
'external-contents': '/a/b/c/a.h'
}
Currently, when parsing umbrella directories, we use
vfs::recursive_directory_iterator to gather the header files to generate the
equivalent modules for. If the external contents for a header does not exist,
we currently are unable to build a module, since the VFS
vfs::recursive_directory_iterator will fail when it finds an entry without a
reliable real path.
Since the YAML file could be prepared ahead of time and shared among
different compiler invocations, an entry might not yet have a reliable
path in 'external-contents', breaking the iteration.
Give the VFS the capability to skip such entries whenever
'ignore-non-existent-contents' property is set in the YAML file.
rdar://problem/27531549
llvm-svn: 278543
When the VFS uses a YAML file, the real file path for a
virtual file is described in the "external-contents" field. Example:
...
{
'type': 'file',
'name': 'a.h',
'external-contents': '/a/b/c/a.h'
}
Currently, when parsing umbrella directories, we use
vfs::recursive_directory_iterator to gather the header files to generate the
equivalent modules for. If the external contents for a header does not exist,
we currently are unable to build a module, since the VFS
vfs::recursive_directory_iterator will fail when it finds an entry without a
reliable real path.
Since the YAML file could be prepared ahead of time and shared among
different compiler invocations, an entry might not yet have a reliable
path in 'external-contents', breaking the iteration.
Give the VFS the capability to skip such entries whenever
'ignore-non-existent-contents' property is set in the YAML file.
rdar://problem/27531549
llvm-svn: 278457
Add 'ignore-non-existent-contents' to tell the VFS whether an invalid path
obtained via 'external-contents' should cause iteration on the VFS to stop.
If 'true', the VFS should ignore the entry and continue with the next. Allows
YAML files to be shared across multiple compiler invocations regardless of
prior existent paths in 'external-contents'. This global value is overridable
on a per-file basis.
This adds the parsing and write test part, but use by VFS comes next.
Differential Revision: https://reviews.llvm.org/D23422
rdar://problem/27531549
llvm-svn: 278456
Adjust target features for amdgcn target when -cl-denorms-are-zero is set.
Denormal support is controlled by feature strings fp32-denormals fp64-denormals in amdgcn target. If -cl-denorms-are-zero is not set and the command line does not set fp32/64-denormals feature string, +fp32-denormals +fp64-denormals will be on for GPU's supporting them.
A new virtual function virtual void TargetInfo::adjustTargetOptions(const CodeGenOptions &CGOpts, TargetOptions &TargetOpts) const is introduced to allow adjusting target option by codegen option.
Differential Revision: https://reviews.llvm.org/D22815
llvm-svn: 278151
The size of image type is reported incorrectly as size of a pointer to address space 0, which causes error when casting image type to pointers by __builtin_astype.
The fix is to get image address space from TargetInfo then report the size accordingly.
Differential Revision: https://reviews.llvm.org/D22927
llvm-svn: 277647
Adding extension cl_khr_mipmap_image to clang's OpenCL Extensions and initiated inside AMDGPU Target.
Patch by Aaron En Ye Shi.
Differential Revision: https://reviews.llvm.org/D22637
llvm-svn: 277181
This resubmit r270688 which broke some specific buildbots.That's because
there is incorrect indexing problem in the targetparser,and the problem is
fixed in r276957.
Differential Revision: https://reviews.llvm.org/D21277
llvm-svn: 276958
Summary:
In RenderScript, the size of the argument or return value emitted in the
IR is expected to be the same as the size of corresponding qualified
type. For ARM and AArch64, the coercion performed by Clang can
change the parameter or return value to a type whose size is different
(usually larger) than the original aggregate type. Specifically, this
can happen in the following cases:
- Aggregate parameters of size <= 64 bytes and return values smaller
than 4 bytes on ARM
- Aggregate parameters and return values smaller than bytes on
AArch64
This patch coerces the cases above to an integer array that is the same
size and alignment as the original aggregate. A new field is added to
TargetInfo to detect a RenderScript target and limit this coercion just
to that case.
Tests added to test/CodeGen/renderscript.c
Reviewers: rsmith
Subscribers: aemerson, srhines, llvm-commits
Differential Revision: https://reviews.llvm.org/D22822
llvm-svn: 276904
Summary:
N32 and N64 follow the standard ELF conventions (.L) whereas O32 uses its own
($).
This fixes the majority of object differences between -fintegrated-as and
-fno-integrated-as.
Reviewers: sdardis
Subscribers: dsanders, sdardis, llvm-commits
Differential Revision: https://reviews.llvm.org/D22412
llvm-svn: 275967
This patch is to implement sema and parsing for 'target parallel for simd' pragma.
Differential Revision: http://reviews.llvm.org/D22096
llvm-svn: 275365
http://reviews.llvm.org/D21904
This patch is similar to the implementation of 'private' clause: it adds a list of private pointers to be used within the target data region to store the device pointers returned by the runtime.
Please refer to the following document for a full description of what the runtime witll return in this case (page 10 and 11):
https://github.com/clang-omp/OffloadingDesign
I am happy to answer any question related to the runtime interface to help reviewing this patch.
llvm-svn: 275271
Some compilers are too dumb to realize that the switch statement covers
all cases.
(Don't use a "default" label, because we explicitly want to get a warning
if our switch doesn't cover all the cases.)
llvm-svn: 274713
Summary:
Currently our handling of CUDA architectures is scattered all around
clang. This patch centralizes it.
A key advantage of this centralization is that you can now write a C++
switch on e.g. CudaArch and get a compile error if you don't handle one
of the enum values.
Reviewers: tra
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D21867
llvm-svn: 274681
Summary: Also add sm_32, which was missing.
Reviewers: tra
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D21778
llvm-svn: 274680
Summary: This patch is an implementation of sema and parsing for the OpenMP composite pragma 'distribute simd'.
Differential Revision: http://reviews.llvm.org/D22007
llvm-svn: 274604
Summary: This patch is an implementation of sema and parsing for the OpenMP composite pragma 'distribute parallel for simd'.
Differential Revision: http://reviews.llvm.org/D21977
llvm-svn: 274530
Currently we only have OpenCL 2.0 Builtins i.e. pipes or address space conversions.
They have to be added only in the version 2.0 compilation mode to make the identifiers
available for use in the other versions.
Review: http://reviews.llvm.org/D20249
llvm-svn: 274509
Summary: This change exposes the recently added LEON CPUs (D19359) in the LLVM Sparc backend to Clang, allowing the cpu's to be selected using the -mcpu flag.
Reviewers: jyknight, lero_chris
Subscribers: jyknight, cfe-commits
Differential Revision: http://reviews.llvm.org/D21683
llvm-svn: 274487
Summary:
The TargetInfo for 'renderscript32' and 'renderscript64' ArchTypes are
subclasses of ARMleTargetInfo and AArch64leTargetInfo respectively.
RenderScript32TargetInfo modifies the ARM ABI to set LongWidth and
LongAlign to be 64-bits. Other than this modification, the underlying
TargetInfo base classes is initialized as if they have "armv7" and
"aarch64" architecture type respectively.
Reviewers: rsmith, echristo
Subscribers: aemerson, tberghammer, cfe-commits, danalbert, mehdi_amini, srhines
Differential Revision: http://reviews.llvm.org/D21334
llvm-svn: 274409
Summary:
Summary:
Change Clang calling convention SpirKernel to OpenCLKernel.
Set calling convention OpenCLKernel for amdgcn as well.
Add virtual method .getOpenCLKernelCallingConv() to TargetCodeGenInfo
and use it to set target calling convention for AMDGPU and SPIR.
Update tests.
Reviewers: rsmith, tstellarAMD, Anastasia, yaxunl
Subscribers: kzhuravl, cfe-commits
Differential Revision: http://reviews.llvm.org/D21367
llvm-svn: 274220
We continue accepting "macosx" but canonicalize it to "macos", When emitting
diagnostics, we use "macOS" instead of "OS X".
The PlatformName in TargetInfo is changed from "macosx" to "macos" so we can
directly compare the Platform in AvailabilityAttr with the PlatformName
in TargetInfo.
rdar://26795172
rdar://26800775
llvm-svn: 274064
The final change is required to extend the back-end's AtomicExpandPass that was implemented for Sparc (64 bit) and later extended for Sparc (32 bit).
llvm-svn: 274012
Recently, Microsoft added support for a flag, /std, which controls which
version of the language rules MSVC should use.
MSVC hasn't updated __cplusplus though.
Instead, they added a new macro, _MSVC_LANG, which is defined in a
similar fashion to __cplusplus. This is used to indicate which mode the
compiler is in.
llvm-svn: 273987
[OpenMP] Initial implementation of parse and sema for composite pragma 'distribute parallel for'
This patch is an initial implementation for #distribute parallel for.
The main differences that affect other pragmas are:
The implementation of 'distribute parallel for' requires blocking of the associated loop, where blocks are "distributed" to different teams and iterations within each block are scheduled to parallel threads within each team. To implement blocking, sema creates two additional worksharing directive fields that are used to pass the team assigned block lower and upper bounds through the outlined function resulting from 'parallel'. In this way, scheduling for 'for' to threads can use those bounds.
As a consequence of blocking, the stride of 'distribute' is not 1 but it is equal to the blocking size. This is returned by the runtime and sema prepares a DistIncrExpr variable to hold that value.
As a consequence of blocking, the global upper bound (EnsureUpperBound) expression of the 'for' is not the original loop upper bound (e.g. in for(i = 0 ; i < N; i++) this is 'N') but it is the team-assigned block upper bound. Sema creates a new expression holding the calculation of the actual upper bound for 'for' as UB = min(UB, PrevUB), where UB is the loop upper bound, and PrevUB is the team-assigned block upper bound.
llvm-svn: 273884
I added this option in r257827 to try and add compatibility with autoconf. At the time I misunderstood the problem.
Our CMake automatically generates the SVN revision information and generates a build action to update it so builds don't need to be re-configured on SCM update (which is a better solution than we had in autoconf).
The problem I was actually seeing was isolated cases where SVN revision information isn't available because the repository structures have been removed. This happens in some automated testing systems.
This patch allows SVN_REVISION to be overridden if the build configuration could not find the SCM repository structures, and removes the code from my original patch because it is unnecessary.
llvm-svn: 273714