Value, type, and instantiation dependence were not being handled
correctly for CUDAKernelCallExpr AST nodes. As a result, if an
undeclared identifier was used in the triple-angle-bracket kernel call
configuration, there would be no error during parsing, and there would
be a crash during code gen. This patch makes sure that an error will be
issued during parsing in this case, just as there would be for any other
use of an undeclared identifier in C++.
Patch by Jason Henline.
Reviewers: jlebar, rsmith
Differential Revision: http://reviews.llvm.org/D15858
llvm-svn: 257839
Modify the TSTiterator to have two internal iterators, which will walk
the provided sugared type and the desugared type. This will provide better
access to the template argument information. No functional changes.
llvm-svn: 257838
This patch makes use of the handleLoadedFile hook added in r257814.
That method is used to check the arch and the OS of the files we are linking
against the arch and OS on the context.
The first test to use this ensures that we do not try to combine i386 Mac OS code
with i386 simulator code.
llvm-svn: 257837
The release builds are configured to be reproducible, so that the
binaries compare equal between bootstrap iterations. The OpenMP
run-time build was failing like this:
runtime/src/kmp_version.c:108:79: error: expansion of date or time macro is not reproducible [-Werror,-Wdate-time]
char const __kmp_version_build_time[] = KMP_VERSION_PREFIX "build time: " __DATE__ " " __TIME__;
Figuring as the build currently doesn't set LIBOMP_DATE, it's probably
OK to skip setting the build time here too.
llvm-svn: 257833
1) Instead of using pairs of From/To* fields, combine fields into a struct
TemplateArgInfo and have two in each DiffNode.
2) Use default initialization in DiffNode so that the constructor shows the
only field that is initialized differently on construction.
3) Use Set and Get functions per each DiffKind to make sure all fields for the
diff is set. In one case, the Expr fields were not set.
4) Don't print boolean literals for boolean template arguments. This prevents
printing 'false aka 0'
Only #3 has a functional change, which is reflected in the test change.
llvm-svn: 257831
This is to enable isa<> support for any files which need it.
It will be used in an upcoming patch to differentiate MachOFile from other implicitly generated files.
Reviewed by Lang Hames.
Differential Revision: http://reviews.llvm.org/D16103
llvm-svn: 257830
Autoconf does this in the GetRepositoryPath script, CMake's VersionFromVCS does grab the SVN_REVISION, but doesn't populate the repository URL.
llvm-svn: 257826
Summary:
Before this the Verifier didn't complain if the GlobalVariable
referenced from a DIGlobalVariable was not in fact in the correct
module (it would crash while writing bitcode though). Fix this by
always checking parantage of GlobalValues while walking constant
expressions and changing the DIGlobalVariable visitor to also
visit the constant it contains.
Reviewers: rafael
Differential Revision: http://reviews.llvm.org/D16059
llvm-svn: 257825
Summary:
We already have the inverse verification that we only use globals
that are defined in this module. This essentially catches the
same mistake, but when verifying the module that contains the
definition.
Reviewers: rafael
Differential Revision: http://reviews.llvm.org/D15272
llvm-svn: 257823
classes.
OrcRemoteTargetClient::RCMemoryManager will now register EH frames with the
server automatically. This allows remote-execution of code that uses exceptions.
llvm-svn: 257816
This is called from the resolver on each file we decide we actually want to use.
Future commits will make use of this to extract useful information from the files and do
error checking against the context. For example, ensure that files are the same arch as
each other.
Reviewed by Lang Hames.
Differential Revision: http://reviews.llvm.org/D16093
llvm-svn: 257814
If your program refers to modules (as indicated in DWARF) we will now try to
load these modules and give you access to their types in expressions. This used
to be gated by a setting ("settings set target.auto-import-clang-modules true")
but that setting defaulted to false. Now it defaults to true -- but you can
disable it by toggling the setting to false.
llvm-svn: 257812
Summary:
Previously we compiled CUDA device code to PTX assembly and embedded
that asm as text in our host binary. Now we compile to PTX assembly and
then invoke ptxas to assemble the PTX into a cubin file. We gather the
ptx and cubin files for each of our --cuda-gpu-archs and combine them
using fatbinary, and then embed that into the host binary.
Adds two new command-line flags, -Xcuda_ptxas and -Xcuda_fatbinary,
which pass args down to the external tools.
Reviewers: tra, echristo
Subscribers: cfe-commits, jhen
Differential Revision: http://reviews.llvm.org/D16082
llvm-svn: 257809
Summary:
Right now if the Action graph is a DAG and we encounter an action twice,
we will run it twice.
This patch is difficult to test as-is, but I have testcases for this as
used within CUDA compilation.
Reviewers:
Subscribers:
llvm-svn: 257808
This patch seeds the SLP vectorizer with getelementptr indices. The primary
motivation in doing so is to vectorize gather-like idioms beginning with
consecutive loads (e.g., g[a[0] - b[0]] + g[a[1] - b[1]] + ...). While these
cases could be vectorized with a top-down phase, seeding the existing bottom-up
phase with the index computations avoids the complexity, compile-time, and
phase ordering issues associated with a full top-down pass. Only bundles of
single-index getelementptrs with non-constant differences are considered for
vectorization.
Differential Revision: http://reviews.llvm.org/D14829
llvm-svn: 257800
Rounding up an integer m to a nearest multiple of n where n is a power
of 2 is used very often if you are writing code to emit binary files.
RoundUpToAlignment is a small function to do that. But we found that the
function has a small but annoying issue; the name is a bit too long.
Because it is used quite often, that hurts readability.
This patch is to rename the function. The original name is kept as a
forwarder, so that submitting this patch won't immediately break Clang
and other LLVM projects. Once I update all occurrences of RoundUpToAlignment,
I'll remove the old name entirely.
http://reviews.llvm.org/D16162
llvm-svn: 257799
MIPS ABI has relocations like R_MIPS_JALR which is just a hint for
linker to make some code optimization. Such relocations should not be
handled as a regular ones and lead to say dynamic relocation creation.
The patch introduces new virtual `Target::isHintReloc` method, overrides
it in the `MipsTargetInfo` class and calls it in the `Writer<ELFT>::scanRelocs`
method.
Differential Revision: http://reviews.llvm.org/D16193
llvm-svn: 257798
Summary: If SROA creates only one piece (e.g. because the other is not needed),
it still needs to create a bit_piece expression if that bit piece is smaller
than the original size of the alloca.
Reviewers: aprantl
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D16187
llvm-svn: 257795
Since r230276, we support an improved legalization for f64->f16,
which goes through a temporary f32, improving codegen when
f32->f16 is legal but not f64->f16. This requires unsafe-fp-math.
However, that legalization assumed that the second step, producing
a pseudo-softened f16, had type i16. That's not true on targets
with illegal i16, such as ARM.
Use the initial f64->f16 result type instead.
llvm-svn: 257794
This flag allows to disable old way of determining dynamic TLS by
filtering out allocations from dynamic linker. This will be eventually
superseded by __tls_get_addr interceptor (see r257785), after we:
1) Test it in several supported environments
2) Deal with existing problems (currently we can't find a pointer to
DTV which is calloc()-ed in pthread_create).
llvm-svn: 257789