Summary:
This change re-lands r241621, with an additional fix that was required to allow tool sources to live outside the llvm checkout. It also no longer renames LLVM_EXTERNAL_*_SOURCE_DIR. This change was reverted in r241663, because it renamed several variables of the format LLVM_EXTERNAL_*_* to LLVM_TOOL_*_*.
Original Summary:
The tools CMakeLists file already had implicit tool registration, but there were a few things off about it that needed to be altered to make it work. This change addresses all that. The changes in this patch are:
* factored out canonicalizing tool names from paths to CMake variables * removed the LLVM_IMPLICIT_PROJECT_IGNORE mechanism in favor of LLVM_EXTERNAL_${nameUPPER}_BUILD which I renamed to LLVM_TOOL_${nameUPPER}_BUILD because it applies to internal and external tools
* removed ignore_llvm_tool_subdirectory() in favor of just setting LLVM_TOOL_${nameUPPER}_BUILD to Off
* Added create_llvm_tool_options() to resolve a bug in add_llvm_external_project() - the old LLVM_EXTERNAL_${nameUPPER}_BUILD would not work on a clean CMake directory because the option could be created after it was set in code.
* Removed all but the minimum required calls to add_llvm_external_project from tools/CMakeLists.txt
Differential Revision: http://reviews.llvm.org/D10665
llvm-svn: 242059
- Changed driver pipeline to compile host and device side of CUDA
files and incorporate results of device-side compilation into host
object file.
- Added a test for cuda pipeline creation in clang driver.
New clang options:
--cuda-host-only - Do host-side compilation only.
--cuda-device-only - Do device-side compilation only.
--cuda-gpu-arch=<ARCH> - specify GPU architecture for device-side
compilation. E.g. sm_35, sm_30. Default is sm_20. May be used more
than once in which case one device-compilation will be done per
unique specified GPU architecture.
Differential Revision: http://reviews.llvm.org/D9509
llvm-svn: 242058
And make the module unavailable without breaking any parent modules.
If there's a missing requirement after we've already seen a missing
header, still update the IsMissingRequiement bit correctly. Also,
diagnose missing requirements before missing headers, since the
existence of the header is moot if there are missing requirements.
llvm-svn: 242055
1.) in kmp_csupport.c, move computation of parameters only needed for OMPT tracing
inside a conditional to reduce overhead if not receiving ompt_event_master_begin
callbacks.
2.) in kmp_gsupport.c, remove spurious reset of OMPT reenter_runtime_frame (which
is set in its caller, GOMP_parallel_start correct placement of #if OMP_TRACE so
that state is maintained even if tracing support not included.
3.) in z_Linux_util.c, add architecture independent support for OMPT by setting
and resetting OMPT's exit_frame_ptr before and after invoking a microtask.
4.) On the Intel MIC, the loader refuses to retain static symbols in the
libomp.so shared library, even though tools need them. The loader could not be
bullied into doing so. To accommodate this, I changed the visibility of OMPT
placeholder functions to public. This required additions in exports.so.txt,
adding extern "C" scoping in ompt-general.c so that the public placeholder
symbols won't be mangled.
Patch by John Mellor-Crummey
Differential Revision: http://reviews.llvm.org/D11062
llvm-svn: 242052
It always takes me a while to figure out how to say "preprocess to file
foo.txt" with clang-cl. With this, it might be easier.
http://reviews.llvm.org/D10890
llvm-svn: 242051
Enable partial and runtime loop unrolling for NVPTX backend via
TTI::UnrollingPreferences with a small threshold. This partially unrolls
small loops which are often unrolled by the PTX to SASS compiler
and unrolling earlier can be beneficial.
llvm-svn: 242049
This change updates the documentation for the loop unrolling pragma behavior
change in r242047. Specifically, with that change "#pragma unroll" will not
unroll loops with a runtime trip count.
llvm-svn: 242048
Enable runtime unrolling for loops with unroll count metadata ("#pragma unroll N")
and a runtime trip count. Also, do not unroll loops with unroll full metadata if the
loop has a runtime loop count. Previously, such loops would be unrolled with a
very large threshold (pragma-unroll-threshold) if runtime unrolled happened to be
enabled resulting in a very large (and likely unwise) unroll factor.
llvm-svn: 242047
This commit serializes the fixed stack objects, including fixed spill slots.
The fixed stack objects are serialized using a YAML sequence of YAML inline
mappings. Each mapping has the object's ID, type, size, offset, and alignment.
The objects that aren't spill slots also serialize the isImmutable and isAliased
flags.
The fixed stack objects are a part of the machine function's YAML mapping.
Reviewers: Duncan P. N. Exon Smith
llvm-svn: 242045
Passes should never modify it, just use the const version. While there
reduce copying in LoopInterchange. No functional change intended.
llvm-svn: 242041
It had accidently accepted a symbol+offset value (and emitted
incorrect code for it, keeping only the offset part) instead of
properly reporting the constraint as invalid.
Differential Revision: http://reviews.llvm.org/D11039
llvm-svn: 242040
The two-address instruction pass will convert these back to v_mad_f32
if necessary.
Differential Revision: http://reviews.llvm.org/D11060
llvm-svn: 242038
The 64/128-bit vector types are legal if NEON instructions are
available. However, there was no matching patterns for @llvm.cttz.*()
intrinsics and result in fatal error.
This commit fixes the problem by lowering cttz to:
a. ctpop((x & -x) - 1)
b. width - ctlz(x & -x) - 1
llvm-svn: 242037
Summary:
The iteration order within a member of DepCands is deterministic
and therefore we don't have to sort the accesses within a member.
We also don't have to copy the indices of the pointers into a
vector, since we can iterate over the members of the class.
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D11145
llvm-svn: 242033
Attribute names usually support an alternate spelling that uses double
underscores before and after the attribute name, like e.g. attribute
((__aligned__)) for attribute ((aligned)). This is necessary to allow
use of attributes in system headers without polluting the name space.
However, for attribute ((enable_if)) that alternate spelling does not
work correctly. This is because of code in Parser::ParseGNUAttributeArgs
(ParseDecl.cpp) that specifically checks for the "enable_if" spelling
without allowing the alternate spelling.
Similar code in ParseDecl.cpp uses the normalizeAttrName helper to allow
both spellings. This patch adds use of that helper for the "enable_if"
check as well, which fixes attribute ((__enable_if__)).
Differential Revision: http://reviews.llvm.org/D11142
llvm-svn: 242029
In this patch I have only encoding. Intrinsics and DAG lowering will be in the next patch.
I temporary removed the old intrinsics test (just to split this patch).
Half types are not covered here.
Differential Revision: http://reviews.llvm.org/D11134
llvm-svn: 242023
Summary:
r241964 has added a dependency on uuid.h, which (on linux at least) necessitates instalation of a
new package. Since the only thing we need from that file is uuid_t (and this is already defined
in UuidCompatibility.h), we can avoid this dependency by making this include __APPLE__ specific.
If in future, we need more from this library, we can revisit this decision.
Reviewers: jasonmolenda
Subscribers: lldb-commits
Differential Revision: http://reviews.llvm.org/D11135
llvm-svn: 242022
The size of a long double was hardcoded in DataExtractor for x86 and
x86_64 architectures. This CL removes the hard coded values and use the
actual size based on the floating point semantics specified.
Differential revision: http://reviews.llvm.org/D8417
llvm-svn: 242019
Summary:
This is the first part of our effort to make llgs single threaded. Currently, llgs consists of
about three threads and the synchronisation between them is a major source of latency when
debugging linux and android applications.
In order to be able to go single threaded, we must have the ability to listen for events from
multiple sources (primarily, client commands coming over the network and debug events from the
inferior) and perform necessary actions. For this reason I introduce the concept of a MainLoop.
A main loop has the ability to register callback's which will be invoked upon receipt of certain
events. MainLoopPosix has the ability to listen for file descriptors and signals.
For the moment, I have merely made the GDBRemoteCommunicationServerLLGS class use MainLoop
instead of waiting on the network socket directly, but the other threads still remain. In the
followup patches I indend to migrate NativeProcessLinux to this class and remove the remaining
threads.
Reviewers: ovyalov, clayborg, amccarth, zturner, emaste
Subscribers: tberghammer, lldb-commits
Differential Revision: http://reviews.llvm.org/D11066
llvm-svn: 242018
On Android the oatdata and the oatexec symbols in
system@framework@boot.oat covers the full .text section what causes
issues with displaying unusable symbol name to the user and very slow
unwinding speed because the instruction emulation based unwind plans
try to emulate all instructions in these symbols. Don't add these
symbols to the symbol list as they have no use for the debugger and
they are causing a lot of trouble.
Differential revision: http://reviews.llvm.org/D11065
llvm-svn: 242017
This patch:
- Allows mips32 cores to match with any mips32/mips64 cores.
- Allows mips32r2 cores to match with core only up-to mips32r2/mips64r2.
- Allows mips32r3 cores to match with core only up-to mips32r3/mips64r3.
- Allows mips32r5 cores to match with core only up-to mips32r3/mips64r5.
- Allows mips32r6 core to match with only mips32r6/mips64r6 or mips32/mips64.
Reviewers: emaste, jaydeep, clayborg
Subscribers: mohit.bhakkad, nitesh.jain, bhushan, lldb-commits
Differential Revision: http://reviews.llvm.org/D10921
llvm-svn: 242016
It exists for compatibility with GCC which requires it to print MSA registers
for the 'f' constraint. Although LLVM doesn't need it, the 'w' modifier should
still be used for portability between the two compilers.
llvm-svn: 242015
Summary:
This at least saves compile time. I also encountered a case where
ephemeral values affect whether other variables are promoted, causing
performance issues. It may be a bug in LSR, but I didn't manage to
reduce it yet. Anyhow, I believe it's in general not worth considering
ephemeral values in LSR.
Reviewers: atrick, hfinkel
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D11115
llvm-svn: 242011
Three things:
- The atomic intrinsics mandate memory barriers, let's start emitting
some.
- We don't need to manually create RMW operations, we can just do
__atomic_fetch_foo instead of performing __atomic_foo_fetch and
undoing foo.
- Don't use inline assembly, we don't need it for these intrinsics.
This fixes PR24101.
llvm-svn: 242009